JPA in distributed Java EE configuration

I'm developing a Java EE application to run on Glassfish:
Database (JavaDB, MS SQL, MySQL or Oracle)
EJB layer with JPA (TopLink Essentials, from Glassfish) for database access
JSF/ICEfaces-based web UI accessing the EJB layer
The application will have a lot of concurrent web clients, so I want to run it on different physical servers and use a load balancer. My problem now is how to keep the applications synchronized. I intend to set up multiple servers, each running Glassfish with my EAR app installed. Whenever on one of the servers data is added to or removed from the database (via JPA, no direct SQL queries), this change should be reflected in the JPA layer on the other servers.
I've been looking around for solutions to this, but couldn't find anything I really like (the full TopLink from Oracle claims to have a solution, but I don't know). Doing a refresh before every access to a JPA entity could work, but is far from efficient.
Are there any patterns, libraries, ... that could help here?

Whenever on one of the servers data is added to or removed from the database (via JPA, no direct SQL queries), this change should be reflected in the JPA layer on the other servers.
I don't understand what state is maintained and needs to be updated at the "JPA layer". Entity managers are isolated, and the way to deal with concurrency in JPA is to use locking (optimistic or pessimistic, the former scaling better). Maybe you could clarify what exactly you need and what you saw.
See also
Java Persistence/Locking
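To make the optimistic locking idea above concrete, here is a minimal plain-Java sketch of the version check behind it. It is not JPA code: the class OptimisticLockSketch, its Row type and the in-memory map standing in for the database table are all hypothetical names invented for this illustration. In JPA the same effect is normally obtained by putting a @Version field on the entity, and a stale update surfaces as an OptimisticLockException instead of a false return value.

```java
import java.util.HashMap;
import java.util.Map;

/** Plain-Java illustration of a version check; this is not JPA itself. */
class OptimisticLockSketch {

    /** A stored row: a value plus the version number it was written with. */
    static final class Row {
        final String value;
        final long version;

        Row(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    /** Hypothetical stand-in for the database table, keyed by primary key. */
    private final Map<Long, Row> table = new HashMap<>();

    synchronized void insert(long id, String value) {
        table.put(id, new Row(value, 0L));
    }

    synchronized Row read(long id) {
        return table.get(id);
    }

    /**
     * Mimics "UPDATE ... SET value = ?, version = version + 1
     * WHERE id = ? AND version = ?". Returns false when the row was
     * changed by someone else after it was read (a stale update).
     */
    synchronized boolean update(long id, String newValue, long expectedVersion) {
        Row current = table.get(id);
        if (current == null || current.version != expectedVersion) {
            return false; // JPA would raise OptimisticLockException here
        }
        table.put(id, new Row(newValue, expectedVersion + 1));
        return true;
    }
}
```

If two servers read the same row and both try to update it, only the first update succeeds; the second one carries an outdated version number and is rejected, so the caller can reload the data and retry.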

Related

Using Hibernate/JPA with multiple ClassLoaders to access multiple databases

Our application is a middle-tier application that provides a dozen or so front-end applications with access to a couple dozen databases (and other data sources) on the back end.
We decided on using OSGi to separate the unrelated bits of code into separate bundles. This ensures proper code encapsulation and even allows for hot-swapping of specific bundles.
One advantage of this is that any code speaking to a specific database is isolated to a single bundle. It also allows us to simply drop in a new bundle for a new destination, and seamlessly integrates the new code. It also ensures that if a single back-end data source is down, that requests to other data sources are unaffected. One complication is that each of those bundles is loaded by a separate ClassLoader.
We'd like to start using JPA for our new destinations that we're building. Previously, we have been using JDBC directly to send SQL queries and updates.
We've looked into Hibernate 4, but it seems that it was built on the assumption that everything is loaded using a single ClassLoader. Switching between ClassLoaders for different bundles does not appear to be something it can handle consistently.
While it seems that Hibernate 5 may have corrected that issue, all the tutorials/documentation I've found for it gloss over the complexities of configuration. Most simply assume you are using a single application-level configuration file, which will not suit our needs at all.
So, my questions are:
Does Hibernate 5 properly handle connecting to multiple databases, with the configuration/POJOs for each database loaded by a different ClassLoader?
How do we configure Hibernate to connect to multiple databases using multiple ClassLoaders?
Is there another JPA framework that might be better suited to our specific needs?

Hibernate is fine but for OSGi usage you also need an intermediary. In the OSGi specs this is defined by the OSGi JPA service spec. It defines how to connect to a JPA provider in OSGi without a hard reference to it.
This spec is implemented by Aries JPA. It also provides additional support for Blueprint and Declarative Services. There is also the Aries Transaction Control service, which takes a similar approach to supporting JPA and transactions in OSGi. It also uses the core of Aries JPA but is a bit different in usage.
The last part you might need is pax-jdbc, which lets you define an XA datasource with configuration alone. The examples already use it.
To get started easily you can use Apache Karaf, which has features for all of the above.
Aries JPA lets you use different databases in the same OSGi application.

How to introduce High Availability to a simple JPA application?

I want to have two servers running, both with a small local database accessed via JPA. The 2nd server is a hot standby ready to take over.
How do I keep the databases in sync, and how do I handle caching etc at the JPA layer?
I am curious to know if anyone has tried this, and what technologies they used?

For caching you can use Ehcache. It comes with Hibernate by default if you are using that JPA Persistence Provider. You will need to specify which entities are cacheable using annotations and you can also configure the caching policy for specific entities through XML.
For database replication I wouldn't go for a JPA/JDBC-based solution, apart from basic IP failover, which could be supported by your driver depending on what the database is. Replication is something the database should be responsible for, not the application. Most of the popular databases (PostgreSQL, MySQL, MS SQL, Oracle, etc.) nowadays support some kind of replication or clustering, so it depends on what you're using.

Java web application memory handling

I have a Java web application which uses Hibernate for storing data into the database and retrieving them.
The strategy I am currently using is to load everything from the database on to the application at start up, and saving/updating them to the database as the user interacts with the application.
What I have also done is to keep track of Transaction history for each user as part of the business logic. (So this transaction history is all loaded on application start up).
The problem I can see is that I shouldn't load all the transaction history for all users: if there is a lot of transaction history, and users do not necessarily need to see it, a lot of memory could be used up, which is not efficient.
I was wondering if there is something similar to what a PHP script can do, which is to query the database only when the user requests to see the transaction history, so that it does not use server resources (aside from querying the database). Or what are some suggestions/comments regarding what I am facing right now?
Thank you.

Query Hibernate when you need a given piece of information and let Hibernate manage putting it back to the database. This will allow Hibernate to manage the caching.
Note, that when using Hibernate, you should let Hibernate manage the data completely. Do not add or change data yourself using raw SQL.
If you are using a modern container, you should consider migrating to JPA as it is the standard in Java EE containers, allowing you to be more flexible when you need to scale. JPA is very close to Hibernate, but is an API, not an implementation, so you have more than one to choose from.

Why not query Hibernate for each incoming request and release the data after the response? This is a common approach.
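The "load only when requested" idea from the answers above could be sketched roughly as follows. This is a plain-Java illustration, not Hibernate code: OnDemandHistory and its loader function are hypothetical names, and the loader stands in for whatever query the application would run per request (for example a Hibernate query by user id). The small least-recently-used map is only there to show that memory use stays bounded; with Hibernate, its own session and second-level caches could take over that role.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Loads a user's history only when asked, keeping a bounded number in memory. */
class OnDemandHistory {

    // Hypothetical hook: wraps the real per-user database query.
    private final Function<String, List<String>> loader;
    private final Map<String, List<String>> cache;

    OnDemandHistory(Function<String, List<String>> loader, final int maxUsers) {
        this.loader = loader;
        // An access-ordered LinkedHashMap drops the least recently used entry
        // once more than maxUsers histories are held.
        this.cache = new LinkedHashMap<String, List<String>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, List<String>> eldest) {
                return size() > maxUsers;
            }
        };
    }

    /** Returns the history, running the query only if it is not cached. */
    synchronized List<String> historyFor(String userId) {
        List<String> history = cache.get(userId);
        if (history == null) {
            history = Collections.unmodifiableList(loader.apply(userId));
            cache.put(userId, history);
        }
        return history;
    }

    /** Number of users whose history is currently held in memory. */
    synchronized int cachedUsers() {
        return cache.size();
    }
}
```

Nothing is loaded at start-up; the first call to historyFor for a user triggers the query, and histories of users who have not been seen recently are dropped from memory again.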

Spring RESTful service application architecture

Currently we are building web service applications with Spring, Hibernate, MySQL and Tomcat. We are not using a real application server - SOA architecture. Regarding the persistence layer: today we are using Hibernate with MySQL, but in a year we may end up with MongoDB and Morphia.
The idea here is to design the architecture of the system independently of the concrete database engine or persistence layer and get the maximum benefit.
Let me explain - https://s3.amazonaws.com/creately-published/gtp2dsmt1. We have two cases here:
Scenario one:
We have one database that is replicated (not at the beginning) and different applications. Each application is one WAR that has its own controllers, application context and servlet XML. The domain and persistence layer is imported as a Maven library - there is one version of it that is included in each application.
Pros:
Small applications that are easy to maintain
Distributed solution - each application can be moved to its own Tomcat instance or a different machine, for example
Cons:
Possible problems when using hibernate session and sync of it between different applications. I don't know that is possible at all with that implementation.
Scenario two - one application that has internal logic to split and organize different services - News and User.
Pros:
One persistence layer - full features of Hibernate
More J2EE look with options to extend to the next level - integrate EJB and move to an application server
Cons:
One huge WAR application - more effort to maintain
Not distributed as in the first scenario
I like the first scenario more, but I'm worried about Hibernate's behavior in that case and all the benefits that I can get from it.
I'll be very thankful for your opinion on that case.
Cheers

Possible problems when using hibernate session and sync of it between different applications. I don't know that is possible at all with that implementation.
There are a couple of solutions that solve this exact problem:
Terracotta
Take a look at Hibernate Distributed Cache Tutorial
There is also a somewhat older SlideShare deck, Scaling Hibernate with Terracotta, that makes the point in pictures
Infinispan
Take a look at Using Infinispan as JPA-Hibernate Second Level Cache Provider
Going with the first solution (distributed) may be the right way to go.
It all depends on what the business problem is
Of course distributed is cool and fault tolerant and, and,.. but RAM and disks are getting cheaper and cheaper, so "scaling up" (and having a couple of hot replicas) is actually NOT all that bad => these are props to the "second" approach you described.
But let's say you go with approach #1. If you do that, you would benefit from switching to NoSQL in the future, since you now have replica sets / sharding, etc. and actually several nodes to support the concept.
But is 100% consistency a must-have (e.g. does the product have to do with money)? How big are you planning to become => are you ready to maintain hundreds of servers? Do you have complex aggregate queries that need to run faster than xteen hours?
These are the questions that, in addition to your understanding of the business, should help you land on #1 or #2.

So, this is a very late answer, but finally I'm ready to answer. I'll put some details here about the further development of the REST service application.
Finally I landed on solution #1 from tolitius's great answer, with the option to migrate to solution #2 at a later stage.
This is the application architecture - I'll add graphics later.
Persistence layer - this holds the domain model and all database operations. Generated from the database model with Spring Roo, with generated repository and service layers for easy migration later.
Business layer - all the business logic necessary for the operations is located here. This layer depends on the Persistence layer.
Presentation layer - validation, and controllers calling the Business layer.
All of this runs on Tomcat without application server extras. At a later phase this can be moved to an application server and implement the Service Locator pattern fully.
Infrastructure - geo-located servers with a geo load balancer, a MySQL replication ring between all of them, and one backup server in case of failure.
My idea was to make a more modern system architecture, but from my experience with Java technology this is a "normal risk" situation.
With more experience - more beautiful solutions :) Looking forward to this!

What's the best way to share business object instances between Java web apps using JBoss and Spring?

We currently have a web application loading a Spring application context which instantiates a stack of business objects, DAO objects and Hibernate. We would like to share this stack with another web application, to avoid having multiple instances of the same objects.
We have looked into several approaches; exposing the objects using JMX or JNDI, or using EJB3.
The different approaches all have their issues, and we are looking for a lightweight method.
Any suggestions on how to solve this?
Edit: I have received comments requesting me to elaborate a bit, so here goes:
The main problem we want to solve is that we want to have only one instance of Hibernate. This is due to problems with invalidation of Hibernate's 2nd level cache when running several client applications working with the same datasource. Also, the business/DAO/Hibernate stack is growing rather large, so not duplicating it just makes more sense.
First, we tried to look at how the business layer alone could be exposed to other web apps, and Spring offers JMX wrapping at the price of a tiny amount of XML. However, we were unable to bind the JMX entities to the JNDI tree, so we couldn't look up the objects from the web apps.
Then we tried binding the business layer directly to JNDI. Although Spring didn't offer any method for this, using JNDITemplate to bind them was also trivial. But this led to several new problems: 1) Security manager denies access to RMI classloader, so the client failed once we tried to invoke methods on the JNDI resource. 2) Once the security issues were resolved, JBoss threw IllegalArgumentException: object is not an instance of declaring class. A bit of reading reveals that we need stub implementations for the JNDI resources, but this seems like a lot of hassle (perhaps Spring can help us?)
We haven't looked too much into EJB yet, but after the first two tries I'm wondering if what we're trying to achieve is at all possible.
To sum up what we're trying to achieve: One JBoss instance, several web apps utilizing one stack of business objects on top of DAO layer and Hibernate.
Best regards,
Nils

Are the web applications deployed on the same server?
I can't speak for Spring, but it is straightforward to move your business logic into the EJB tier using Session Beans.
The application organization is straightforward. The logic goes into Session Beans, and these Session Beans are bundled within a single jar as a Java EE artifact with an ejb-jar.xml file (in EJB3, this will likely be practically empty).
Then bundle your Entity classes into a separate jar file.
Next, you will build each web app into its own WAR file.
Finally, all of the jars and the WARs are bundled into a Java EE EAR, with the associated application.xml file (again, this will likely be quite minimal, simply enumerating the jars in the EAR).
This EAR is deployed wholesale to the app server.
Each WAR is effectively independent -- their own sessions, their own context paths, etc. But they share the common EJB back end, so you have only a single 2nd level cache.
You also use local references and calling semantics to talk to the EJBs since they're in the same server. No need for remote calls here.
I think this solves the issue you're having quite well, and it is quite straightforward in Java EE 5 with EJB 3.
Also, you can still use Spring for much of your work, as I understand, but I'm not a Spring person so I cannot speak to the details.

What about spring parentContext?
Check out this article:
http://springtips.blogspot.com/2007/06/using-shared-parent-application-context.html

Terracotta might be a good fit here (disclosure: I am a developer for Terracotta). Terracotta transparently clusters Java objects at the JVM level, and integrates with both Spring and Hibernate. It is free and open source.
As you said, the problem of more than one client web app using an L2 cache is keeping those caches in synch. With Terracotta you can cluster a single Hibernate L2 cache. Each client node works with its copy of that clustered cache, and Terracotta keeps it in synch. This link explains more.
As for your business objects, you can use Terracotta's Spring integration to cluster your beans - each web app can share clustered bean instances, and Terracotta keeps the clustered state in synch transparently.

Actually, if you want a lightweight solution and don't need transactions or clustering, just use Spring's support for RMI. It lets you expose Spring beans remotely using simple annotations in the latest versions. See http://static.springframework.org/spring/docs/2.0.x/reference/remoting.html.

You should take a look at the Terracotta Reference Web Application - Examinator. It has most of the components you are looking for - it's got Hibernate, JPA, and Spring with a MySQL backend.
It's been pre-tuned to scale up to 16 nodes, 20k concurrent users.
Check it out here: http://reference.terracotta.org/examinator

Thank you for your answers so far. We're still not quite there, but we have tried a few things now and see things more clearly. Here's a short update:
The solution which appears to be the most viable is EJB. However, this will require some amount of changes in our code, so we're not going to fully implement that solution right now. I'm almost surprised that we haven't been able to find some Spring feature to help us out here.
We have also tried the JNDI route, which ends with the need for stubs for all shared interfaces. This feels like a lot of hassle, considering that everything is on the same server anyway.
Yesterday, we had a small breakthrough with JMX. Although JMX is definitely not meant for this kind of use, we have proven that it can be done - with no code changes and a minimal amount of XML (a big Thank You to Spring for MBeanExporter and MBeanProxyFactoryBean). The major drawbacks to this method are performance and the fact that our domain classes must be shared through JBoss' server/lib folder. I.e., we have to remove some dependencies from our WARs and move them to server/lib, else we get ClassCastException when the business layer returns objects from our own domain model. I fully understand why this happens, but it is not ideal for what we're trying to achieve.
I thought it was time for a little update, because what appears to be the best solution will take some time to implement. I'll post our findings here once we've done that job.

Spring does have an integration point that might be of interest to you: the EJB 3 injection interceptor. This enables you to access Spring beans from EJBs.

I'm not really sure what you are trying to solve; at the end of the day each JVM will either have replicated instances of the objects, or stubs representing objects existing on another (logical) server.
You could set up a third 'business logic' server that has a remote API which your two web apps could call. The typical solution is to use EJB, but I think Spring has remoting options built into its stack.
The other option is to use some form of shared cache architecture... which will synchronize object changes between the servers, but you still have two sets of instances.
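As a rough illustration of the shared-cache option, the sketch below shows the simplest form of cache synchronization: each node keeps its own local cache and tells the other nodes to evict a key whenever it writes. Everything here is hypothetical and in-process (the InvalidationSketch, Bus and Node names, and the map standing in for the database); real products such as the ones mentioned in this thread do this over the network and with far more care. It also shows the point made above: each node still holds its own set of instances.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

class InvalidationSketch {

    /** Hypothetical in-process stand-in for the network channel between nodes. */
    static final class Bus {
        private final List<Node> nodes = new CopyOnWriteArrayList<>();

        void register(Node node) {
            nodes.add(node);
        }

        /** Tells every node except the sender to drop its copy of the key. */
        void publishInvalidation(Node sender, String key) {
            for (Node node : nodes) {
                if (node != sender) {
                    node.evict(key);
                }
            }
        }
    }

    /** One server: a local cache in front of a shared store. */
    static final class Node {
        private final Map<String, String> localCache = new ConcurrentHashMap<>();
        private final Map<String, String> sharedStore; // stands in for the database
        private final Bus bus;

        Node(Map<String, String> sharedStore, Bus bus) {
            this.sharedStore = sharedStore;
            this.bus = bus;
            bus.register(this);
        }

        /** Reads from the local cache, falling back to the shared store. */
        String read(String key) {
            return localCache.computeIfAbsent(key, sharedStore::get);
        }

        /** Writes through to the store, then invalidates the other nodes. */
        void write(String key, String value) {
            sharedStore.put(key, value);
            localCache.put(key, value);
            bus.publishInvalidation(this, key);
        }

        void evict(String key) {
            localCache.remove(key);
        }
    }
}
```

After a write on one node, the next read on any other node misses its local cache and goes back to the shared store, so it sees the new value without having to refresh before every access.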

Take a look at JBossCache. It allows you to easily share/replicate maps of data between multiple JVM instances (same box or different). It is easy to use and has lots of wire-level protocol options (TCP, UDP multicast, etc.).
