How are very large Java EE web sites structured?

A best-practice question: how are very large websites best structured with Java?
I'm interested in knowing how the deployments themselves are structured. Some possible answers:
A single EAR, with or without session sharing between the constituent WARs?
Multiple WARs, with or without session sharing?
Multiple modules that are assembled into one big WAR at deployment time?
Is there any documented best practice for this?

If you could find data, I'd bet you'd have examples of each one (and perhaps more besides).
I don't know that there's a "best practice" that's uniformly followed.
Your biggest concerns appear to be session sharing and deployment. Regardless of how it's done, I'd say that session data ought to be minimized, and sharing between WARs? No. One owner for data, please. Sharing suggests that you've spread functionality for a single use case across modules. This will lead to grief someday.
As far as packaging goes, I'd say that the bigger the package the more code is affected by changes. If you can partition something into two independent WARs, you can change one without bringing the other down. That's better for maintenance.

When you say "very large websites", I assume you mean a website with many modules and submodules, not a site with heavy traffic. If you are interested in sites with heavy traffic, please rephrase and post a different question, or update this one.
Keep it modular with multiple EARs/WARs. Modularity lets each module evolve independently of the others unless a module interface changes; if an interface changes, the dependent modules need to be updated as well. Although it is not a website, the Eclipse IDE is one such example: each of its modules is developed and maintained independently and has its own version. A modular website also lets you deploy individual modules on separate machines/servers, so each can scale individually.
Session sharing across EARs/WARs is generally considered a bad idea. It is extra work for the server and, like global variables, it can introduce bugs that are hard to debug. But making the user log in again for each module is also really bad, so you would need to implement some Single Sign-On (SSO) solution. As an example, www.google.com, www.gmail.com, www.orkut.com etc. are all Google services, and each service is like an individual module. But if you sign in to one of the services and then open another without signing out, you are automatically logged in.
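To make the SSO idea a bit more concrete, here is a minimal sketch, not a production design. A login module issues a signed token, and any other WAR that knows the same secret can verify it without sharing HTTP session state. The class name `SsoToken`, the token layout (`userId.signature`) and the shared-secret setup are all assumptions made for illustration; a real deployment would more likely rely on an established protocol such as SAML or OpenID Connect, and would at least add an expiry time.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper shared by every module that takes part in single sign-on.
class SsoToken {
    private static final String ALGORITHM = "HmacSHA256";
    private final byte[] secret;

    SsoToken(byte[] secret) {
        this.secret = secret.clone();
    }

    /** Called by the login module: returns "userId.signature". */
    String issue(String userId) {
        return userId + "." + sign(userId);
    }

    /** Called by any other module: returns the user id, or null if the token is not valid. */
    String verify(String token) {
        int dot = token.lastIndexOf('.');
        if (dot < 0) {
            return null;
        }
        String userId = token.substring(0, dot);
        byte[] expected = sign(userId).getBytes(StandardCharsets.UTF_8);
        byte[] actual = token.substring(dot + 1).getBytes(StandardCharsets.UTF_8);
        // Constant-time comparison of the expected and the presented signature.
        return MessageDigest.isEqual(expected, actual) ? userId : null;
    }

    private String sign(String payload) {
        try {
            Mac mac = Mac.getInstance(ALGORITHM);
            mac.init(new SecretKeySpec(secret, ALGORITHM));
            byte[] raw = mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
            // The URL-safe alphabet never contains '.', so the separator stays unambiguous.
            return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC signing failed", e);
        }
    }

    public static void main(String[] args) {
        SsoToken sso = new SsoToken("demo-secret".getBytes(StandardCharsets.UTF_8));
        String token = sso.issue("alice");
        System.out.println(sso.verify(token));       // prints "alice"
        System.out.println(sso.verify(token + "x")); // prints "null"
    }
}
```

Each module only keeps the data it owns in its own session; the token carries nothing but the identity, which fits the "one owner for data" advice given earlier.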
Although assembling modules into one big WAR is a bad idea, once a year you can deploy all modules together (not as a single WAR, but individually) and give that release a name. Eclipse does something similar: individual modules are updated from time to time, but once a year there is a major release in which all modules are upgraded (Europa, Ganymede, Galileo ...).
Every application is different and has different requirements. There are no specific best practices for large websites, as they depend on the website being developed. For example, session sharing is not a good practice, but a business requirement may drive you to do it anyway, or some alternative method could be used to share information across modules.

Related

Is there any need to switch to modules when migrating to Java 9 or later?

We're currently migrating from Java 8 to Java 11. However, upgrading our services was less painful than we anticipated. We basically only had to change the version number in our build.gradle file and the services were happily up and running. We upgraded libraries as well as the (micro)services that use those libs. No problems so far.
Is there any need to actually switch to modules? This would generate needless costs IMHO. Any suggestion or further reading material is appreciated.
To clarify, are there any consequences if Java 9+ code is used without introducing modules? E.g. can it become incompatible with other code?

No.
There is no need to switch to modules.
There has never been a need to switch to modules.
Java 9 and later releases support traditional JAR files on the traditional class path, via the concept of the unnamed module, and will likely do so until the heat death of the universe.
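A small sketch of what "unnamed module" means in practice, assuming Java 9 or later. The class name `UnnamedModuleCheck` is made up for this example; `Class.getModule()`, `Module.isNamed()` and `Module.getName()` are standard APIs.

```java
// Hypothetical demo class; compile and run it from the class path, without a module-info.java.
class UnnamedModuleCheck {

    /** Describes which kind of module a class was loaded into. */
    static String describe(Class<?> type) {
        Module module = type.getModule();
        return module.isNamed()
                ? "named module " + module.getName()
                : "unnamed module (class path)";
    }

    public static void main(String[] args) {
        // JDK classes always live in named modules, e.g. java.base.
        System.out.println(describe(String.class));
        // Code loaded from the class path ends up in the unnamed module.
        System.out.println(describe(UnnamedModuleCheck.class));
    }
}
```

When this is run the traditional way (plain class path, no module descriptor), the second line should report the unnamed module, which is why existing Java 8 style builds keep working.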
Whether to start using modules is entirely up to you.
If you maintain a large legacy project that isn’t changing very much, then it’s probably not worth the effort.
If you work on a large project that’s grown difficult to maintain over the years then the clarity and discipline that modularization brings could be beneficial, but it could also be a lot of work, so think carefully before you begin.
If you’re starting a new project then I highly recommend starting with modules if you can. Many popular libraries have, by now, been upgraded to be modules, so there’s a good chance that all of the dependencies that you need are already available in modular form.
If you maintain a library then I strongly recommend that you upgrade it to be a module if you haven’t done so already, and if all of your library’s dependencies have been converted.
All this isn’t to say that you won’t encounter a few stumbling blocks when moving past Java 8. Those that you do encounter will, however, likely have nothing to do with modules per se. The most common migration problems that we’ve heard about since we released Java 9 in 2017 have to do with changes to the syntax of the version string and to the removal or encapsulation of internal APIs (e.g., sun.misc.BASE64Decoder) for which public, supported replacements have been available for years.
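Both stumbling blocks can be sketched in a few lines. The following is only an illustration under stated assumptions: the class and method names are invented, `java.util.Base64` (available since Java 8) stands in as the supported replacement for the internal sun.misc Base64 classes, and the version parser is one simple way to cope with both the old `1.8.0_292` and the newer `11.0.2` string formats.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class illustrating two typical Java 8 -> 9+ migration fixes.
class Java8MigrationSketch {

    /** Decodes Base64 text with the supported java.util.Base64 API instead of sun.misc. */
    static String decodeBase64(String encoded) {
        byte[] raw = Base64.getDecoder().decode(encoded);
        return new String(raw, StandardCharsets.UTF_8);
    }

    /**
     * Extracts the major release from a java.version style string.
     * Handles the old "1.8.0_292" scheme as well as the newer "11.0.2" scheme.
     */
    static int majorVersion(String version) {
        String[] parts = version.split("[^0-9]+");
        // Old scheme: the leading "1" is a fixed prefix, the major release is the second number.
        if (parts.length > 1 && parts[0].equals("1")) {
            return Integer.parseInt(parts[1]);
        }
        return Integer.parseInt(parts[0]);
    }

    public static void main(String[] args) {
        System.out.println(decodeBase64("aGVsbG8="));  // prints "hello"
        System.out.println(majorVersion("1.8.0_292")); // prints 8
        System.out.println(majorVersion("11.0.2"));    // prints 11
        System.out.println(majorVersion(System.getProperty("java.version")));
    }
}
```

On Java 9 and later, `Runtime.version()` avoids string parsing altogether; the hand-written parser is shown only because code that must still run on Java 8 cannot use it.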

I can only tell you my organization's opinion on the matter. We are in the process of moving to modules, for every single project that we are working on. What we are building is basically micro-services plus some client libraries. For micro-services the transition to modules is a somewhat lower priority: the code there is already fairly isolated in the Docker container, so "adding" modules in there does not seem (to us) very important. This work is being picked up slowly, but it's low priority.
On the other hand, client libraries are an entirely different story. I cannot tell you the mess we have sometimes. I'll explain one point that I hated before Jigsaw. You expose an interface to clients, for everyone to use. Automatically that interface is public: exposed to the world. Usually, what I do then is have some package-private classes that use that interface and are not exposed to the clients. I don't want clients to use those; they are internal. Sounds good? Wrong.
The first problem is that when those package-private classes grow, and you want more classes, the only way to keep everything hidden is to create classes in the same package:
package abc:
-- /* non-public */ Usage.java
-- /* non-public */ HelperUsage.java
-- /* non-public */ FactoryUsage.java
....
When it grows (in our case it does), those packages become way too big. Move to a separate package, you say? Sure, but then HelperUsage and FactoryUsage have to be public, which is what we tried to avoid from the beginning.
Problem number two: any user/caller of our clients can create a package with the same name and extend those hidden classes. It has happened to us a few times already; fun times.
Modules solve this problem in a beautiful way: public is not really public anymore, and I can grant friend access via the exports ... to directive. This makes our code lifecycle and management much easier. And we get away from classpath hell. Of course Maven/Gradle mostly handle that for us, but when there is a problem, the pain is very real. There could be many other examples, too.
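As an illustration of the qualified export mentioned above, here is a sketch of a module descriptor. All module and package names are hypothetical; only the `module`, `exports` and `exports ... to` syntax comes from the Java module system itself.

```java
// module-info.java for a hypothetical client library (all names are illustrative).
module com.example.client {
    // The public API: visible to every module that requires this one.
    exports com.example.client.api;

    // "Friend" access: public types in this package are visible only to the
    // named module, not to arbitrary callers.
    exports com.example.client.internal to com.example.client.support;

    // Packages that are not exported at all stay hidden, even if their
    // classes are declared public.
}
```

With a descriptor like this, helper classes can be moved into their own packages and made public without becoming part of the library's API, provided the library is actually used on the module path rather than the class path.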
That said, the transition is (still) not easy. First of all, everyone on the team needs to be aligned; second, there are hurdles. The two biggest I still see are these. The first: how do you separate each module, and based on what, specifically? I don't have a definite answer yet. The second is split packages, oh the beautiful "same class is exported by different modules". If this happens with your own libraries, there are ways to mitigate it; but if these are external libraries... not that easy.
If you depend on jarA and jarB (separate modules), but they both export abc.def.Util, you are in for a surprise. There are ways to solve this, though. Somewhat painful, but solvable.
Overall, since we started migrating to modules (and we are still migrating), our code has become much cleaner. And if your company is a "code-first" company, this matters. On the other hand, I have been involved in companies where this was seen as "too expensive" with "no real benefit" by senior architects.

What is wrong with sharing versioned libraries between microservices?

Why is it bad practice to share libraries between microservices? Let's say that I want to share a domain model between two microservices (they have the same bounded context, the original microservice was simply split into two smaller ones due to its size). What is wrong with this approach? Changing the domain model won't break anything, as the consumers of the library used a specific version of it?

There is no problem if two microservices share a 3rd party library, and in fact this happens all the time. Many use the same service framework, logging framework, common libs from apache, google, etc.
The problem happens when microservice teams share a 3rd party library that they can modify to suit their own purposes, because if they can, then they eventually will. Requirements from many services will end up getting pushed down into the library, and its purpose will become confused and difficult to state. Its code will bloat.
Service teams will then regularly have to modify the library in the course of their everyday business. Because the library serves many masters, then, they will have to do it very carefully... talk to stakeholders... make sure they don't break anyone else's stuff... Sheesh! The library becomes like a little monolith.
If you share that one library, you'll share others too. Eventually they'll all be little monoliths and your whole architecture will be a monolith made even more annoying by splitting it into several repositories.
-- (added for comments):
Now, you suggest that this problem doesn't happen as long as microservices are depending on a specific version of the library. This doesn't solve the problem, though. It just moves the work around.
Let's say that you depend on a specific version of the library, and 6 teams have made their own modifications to the library since that version. Following your advice, none of the teams bothered to talk to your team about it, so now it's a mess and you have a choice:
1. Spend all the time required to fix any problems that their changes might have caused in your service (and don't bother to talk to them about it, so they'll have to do the same thing on their next upgrade), and then upgrade; or
2. Just fork the library to get rid of all their changes and fix just your own problems.
Choice 2 is the right choice, but almost nobody does this! Because all services share the same library, people think that it's some kind of business rule that they share the same library.
Since people are very reluctant to fork the library after the pattern of sharing it is established, it's better to fork it at the start, i.e., just don't share it around in the first place.

Is it better to have separate projects for Storefront and backend or single?

I am running an ecommerce website built in Java using Spring and Hibernate. If I have to briefly describe current architecture then it is like this:
Two projects - store front and admin
Storefront project holds dao, model, service, controller and views for showing the storefront view of website and also hold APIs for apps
Admin project holds dao, model, service, controller and views for showing the backend/admin interfaces for managing this ecommerce store.
Storefront and admin both independently talk with common MySQL database and whenever any communication is required between these two projects, they do that using REST APIs.
I followed this architecture to develop both projects independently, keep them light, and deploy them independently.
But I am not really sure that this is the correct way of doing things. Major problems that I frequently face are:
It generally causes code duplication, mostly in the models, as both projects have them and most of their properties are common.
If any changes to the database are required, I have to make sure those changes are properly made in both projects, as both make independent DB calls.
Please suggest what could be the best approach for handling and architecting such project. Website stats are:
Around 1 million products
Monthly traffic around 1 million
Per day orders around 1000
And we are looking for more growth in traffic and orders, in next 9 months. So please suggest keeping future scalability in mind.

Shared common model: For a common model that is shared across the applications, may grow, and is prone to frequent changes, you can make the model a separate project and declare it as a pom dependency in each of the projects. Even better, keep JSON schema definitions in that project and use a Maven plugin (assuming you are using Maven) to convert them to POJOs during the build and add them to your project classpath. Please refer to jsonschema2pojo.
Website stats/traffic growth: Please consider load-balanced clustering of your web/application server (if you have not done so already) by adding multiple nodes. This will help with high availability of your application and with handling heavy traffic.
There may be other better solutions, but these points are worth considering. I have some thoughts on your other points but I feel there could be better solutions with others who have better knowledge of architecture.
Hope these thoughts help.

Radical modularity in Java [closed]

The development team I'm a part of wrote and still maintains a codebase of pure Java spaghetti code. Most of this was implemented before I joined this team.
My background is in Python/Django development and the Django community really emphasizes "pluggability" -- a feature of Django "apps" (modules) which implies mostly "one thing and one thing well", reusability, loose coupling, and a clean, conscious API. Once the Django team I once worked in started to "get it", we pretty much had zero problems with messy, tightly-coupled masses of monolithic code.
Our practice involved developing Django apps outside of the Django project in which we intended to use the app (a common practice among Django developers, from what I've gathered). Each even lived in a Git repository separate from that of the overall Django project.
Now, we're presented with an opportunity to refactor much of our spaghetti code, and it seems to me that some of what I learned from the Django community should be applied here. In short, I'd like to see our new codebase developed as a series of "pluggable" modules each written under the assumption that it won't have access to other modules (except those on which it should rationally depend). I believe this should do a good job of driving home principles of proper software design to the whole team.
So, what I'm advocating for is to have one separate repository per "feature" we want in our new (Spring) project. Each would have its own, independent build process and the result would be a .jar. We'd also have a repository for project-level things (JSPs, static files, etc.) and its build process would produce a .war. The .jars wouldn't be placed inside the .war, but rather be treated as Gradle dependencies (the same way third-party dependencies would be).
Now I have to sell it to the boss. He's asked for some example of precedent for this plan. Obvious places to look would be open-source projects, but if a project is split across multiple repositories, it's likely to be multiple projects. So, perhaps I'm looking for some sort of suite. Spring itself looks promising as an example, but I haven't been able to find many others.
My questions are (and sorry for the long back-story):
Is there any such precedent?
What examples are there?
Is there any documentation (even just a blog post would be of help) out there advocating anything like this?
Any suggestions for implementing this?
Is this even a good idea?
Thanks in advance!
Edit: Whether or not to refactor is not in question. We have already decided to make some drastic changes to most of our code -- not primarily for the purpose of "making it cleaner" in fact. My question is about whether our planned project structure is sound and how to justify it to the decision-makers.

The following issues are more important than where to put code on the disk or in an artifact:
If you don't understand that, you have already failed.
What you describe is not refactoring, it is rewriting using a more palatable name:
Unless you already have 100% of the code covered by unit tests, someone is going to get fired over this when (not if) this effort fails spectacularly, probably multiple times!
Even with awesome unit tests, someone is going to miss something and someone is going to take the fall when it finally gets discovered in production, usually after months of silently corrupting data.
Semantics are Important:
Removing Struts and replacing it with Spring is not refactoring; it is rewriting by definition. Refactoring would be moving from Struts 1.1 to 2.0. Replacing Struts means replacing all the Struts code with something else, and that is by definition rewriting, not refactoring.
Working Software comes in many disguises:
The business always thinks what they have is working.
Miss a deadline, introduce a bug no matter how minor, lose a feature no matter how minor, or mis-interpret an undocumented process and change something, no matter how minor or even for the better: they are just looking for any problem, weakness, or potential trouble to spread FUD and make sure your effort is a failure, at least most of the time.
All these things will cost you and your team political capital. Someone will take the fall for these things, no matter how innocent or merit-less the perceived failures are!
Projected outcome:
Most likely the people who take the fall will be you and/or the other "non-Java" developers, not the core Java people who created the working system that you "non-Java" people refactored (code word for rewrite in this case) and then delivered on time but broken and incomplete, delivered late, or didn't deliver at all.

I think it's possible to find examples of good and bad implementation in both Java and Python.
Spring is indeed a good place to start. It emphasizes good layering, designing to interfaces, etc. I'd recommend it highly.
Spring's layering is "horizontal". If "module" means "business functionality" to you, then multiple modules will be present in a given layer.
Pluggable suggests a more vertical approach.
Maybe another way to attack it would be to decompose the system into a set of independent REST web services. Decouple the functionality from the interface completely. That way the services can be shared by web, mobile, or any other client that comes along.
This will require strict isolation of services and ownership of data sources. A single web service will own/manage a swath of data.
I'd recommend looking at Michael Feathers' book "Working Effectively With Legacy Code". It's ten years old, but still highly regarded on Amazon.
One thing I like about the REST web services approach is you can isolate features and do them as time and budget permit. Create and test a service, let clients exercise it, move on to the rest. If things aren't too coupled you can march through the app that way.

Refactoring an entire application is not an easy task, especially if the code is hard-wired as you have described. If pluggability is an important target, I would recommend Grails/Groovy.
If you can somehow identify the view, controller and business logic, or, much better, services, you might be able to reuse some existing code. The good thing about Grails/Groovy is that you are able to combine JSP/Java with GSP/Groovy.
Again, this is really hard to sell, and Grails is probably a good framework to ease the refactoring pain.

I would recommend watching "Four Strategies for Dealing with Legacy Code" by Eric Evans, the originator of Domain-Driven Design: http://dddcommunity.org/library/evans_2011_2/
It may not entirely answer your question but it offers arguments and strategies that might help with your endeavour.

Every component is in its own jar, but the jars are not contained in the war? How should that work? Even third-party libs are included in the war, except for those that are provided by the container. Regarding the JSP stuff, see: Can I serve JSPs from inside a JAR in lib, or is there a workaround?

Why use many sub-projects and dependencies over packages?

For most of my career I’ve worked on small or mid-sized Java projects. I recently saw a huge project comprising 30 projects in Eclipse. I don’t really get the concept of creating many small projects and then maintaining inter-project dependencies. When do we prefer this over simply organizing stuff in packages?
I guessed it’s a Maven thing (I have mostly been using Ant). I’ve been reading up on Maven’s concept of modules as well. I saw some links on the net recommending the creation of different modules for the web, DAO and service layers under a parent module. Is it really a common/best practice?
With or without Maven, does such division really make life easier? Isn’t it more compact to have everything in a single project with a well-defined package structure for the different layers?

It's common to split up projects into API, implementation, web, etc. components–when there's a need to do so. Large projects are just that: large.
There are benefits to keeping components separate:
Re-use functionality (e.g., the web layer uses the service layer)
Package individual components (e.g., ship just the API to a client)
Version sub-components; define their version dependencies
You can do all the same stuff with one giant project, but it's more difficult to determine what goes where, and why. Life is easier when those lines of demarcation are clearly defined.
How much easier depends on the project, but when you're dealing with hundreds of thousands of lines of code, occasionally millions, breaking that stuff up saves huge headaches.
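A compact sketch of the API/implementation/web split described above. In a real build each of the three types would live in its own project or Maven module; they are shown in one file only so the example is self-contained. All names (`ProductCatalog`, `InMemoryProductCatalog`, `ProductPage`) are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// "api" module (hypothetical): the only artifact that would be shipped to clients.
interface ProductCatalog {
    Optional<String> findName(long productId);
}

// "impl" module: depends on the api module and can be versioned and replaced on its own.
final class InMemoryProductCatalog implements ProductCatalog {
    private final Map<Long, String> names = new HashMap<>();

    void add(long productId, String name) {
        names.put(productId, name);
    }

    @Override
    public Optional<String> findName(long productId) {
        return Optional.ofNullable(names.get(productId));
    }
}

// "web" module: compiled against the api module only, never against the implementation.
final class ProductPage {
    private final ProductCatalog catalog;

    ProductPage(ProductCatalog catalog) {
        this.catalog = catalog;
    }

    String render(long productId) {
        return catalog.findName(productId)
                .map(name -> "<h1>" + name + "</h1>")
                .orElse("<h1>Not found</h1>");
    }

    public static void main(String[] args) {
        InMemoryProductCatalog catalog = new InMemoryProductCatalog();
        catalog.add(1L, "Widget");
        ProductPage page = new ProductPage(catalog);
        System.out.println(page.render(1L)); // prints "<h1>Widget</h1>"
        System.out.println(page.render(2L)); // prints "<h1>Not found</h1>"
    }
}
```

Because `ProductPage` only sees the interface, the line of demarcation is enforced by the build: the web project simply has no dependency on the implementation project.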

Why choose to create a separate module in maven? To help you with your development. There really is no other reason.
There are a number of different reasons why you may want to create a separate module:
Separation of concerns: yes, you can do this with packages, but if it's in a separate module, then it can be compiled separately, and you can reduce the amount of tangle[*] in your packages.
The modules are managed by different teams, with release cycles of their own.
More understandable code: If all of your dao code is in one module, and all of your web in another, you can test them separately.
A module can be a separate deployable entity. I have a project which has two web apps, 5 batches and two other core modules (one core for the webapp and one core for the batches). I can now build and deploy each module separately.
The modules are published and used externally. If this is true, then you want the minimum amount of 'other' code in this module.
You choose to break up into modules for the same reasons as you would for separating into packages, but at a higher level, at a group of packages.
30 does seem excessive. But there may be good reasons for it. It's up to you and your project to decide what is the right level for the number of modules.
Personally, I try not to split excessively, unless there is a very good reason to do so.
[*] Tangle: mess which describes the links between packages. Package A uses B, which uses C, which uses both A and B. Something which doesn't help understanding.
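The tangle from the footnote can be detected mechanically. Below is a minimal sketch that runs a depth-first search over a hand-written package dependency map; the class name `PackageTangle` and the map-based input are assumptions for this example, and real projects would normally get such a check from their build or an analysis tool.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: reports whether a package dependency graph contains a cycle.
class PackageTangle {

    /** deps maps each package name to the packages it uses. */
    static boolean hasCycle(Map<String, List<String>> deps) {
        Set<String> inProgress = new HashSet<>();
        Set<String> done = new HashSet<>();
        for (String pkg : deps.keySet()) {
            if (visit(pkg, deps, inProgress, done)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String pkg, Map<String, List<String>> deps,
                                 Set<String> inProgress, Set<String> done) {
        if (done.contains(pkg)) {
            return false; // already fully explored, no cycle through here
        }
        if (!inProgress.add(pkg)) {
            return true; // reached a package that is still on the current path
        }
        for (String used : deps.getOrDefault(pkg, List.of())) {
            if (visit(used, deps, inProgress, done)) {
                return true;
            }
        }
        inProgress.remove(pkg);
        done.add(pkg);
        return false;
    }

    public static void main(String[] args) {
        // The footnote's example: A uses B, B uses C, C uses both A and B.
        Map<String, List<String>> tangled = Map.of(
                "a", List.of("b"),
                "b", List.of("c"),
                "c", List.of("a", "b"));
        // A clean layering: web uses service, service uses dao.
        Map<String, List<String>> layered = Map.of(
                "web", List.of("service"),
                "service", List.of("dao"));
        System.out.println(hasCycle(tangled)); // prints "true"
        System.out.println(hasCycle(layered)); // prints "false"
    }
}
```

Splitting packages into separately compiled modules gives a similar guarantee for free, since build tools such as Maven refuse cyclic dependencies between modules.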

I think excessive modularisation is akin to over engineering. The best route in my opinion is to start with one module/project and keep with that until such point as it becomes obvious to everyone involved that part of this existing module would benefit from being extracted into its own module. Yes, that means extra work at that point, but for me I'd rather do the work then than endlessly battle with unneeded complexity, in terms of my builds and my development environment, for benefits which are never actually realised.
Unfortunately there seems to be a tendency at the start of a project to modularise to the nth degree before even a single line of code is written.

There is another advantage to this: if I have to deploy only selected components, I don't waste time and resources deploying dependencies that I don't need.
