What is wrong with sharing versioned libraries between microservices? - java

Why is it bad practice to share libraries between microservices? Let's say that I want to share a domain model between two microservices (they have the same bounded context; the original microservice was simply split into two smaller ones due to its size). What is wrong with this approach? Changing the domain model won't break anything, since each consumer of the library depends on a specific version of it, right?

There is no problem if two microservices share a 3rd party library, and in fact this happens all the time. Many use the same service framework, logging framework, common libs from apache, google, etc.
The problem happens when microservice teams share a library that they can modify to suit their own purposes, because if they can, then they eventually will. Requirements from many services will end up getting pushed down into the library, its purpose will become confused and difficult to state, and its code will bloat.
Service teams will then regularly have to modify the library in the course of their everyday business. Because the library serves many masters, they will have to do it very carefully... talk to stakeholders... make sure they don't break anyone else's stuff... Sheesh! The library becomes like a little monolith.
If you share that one library, you'll share others too. Eventually they'll all be little monoliths and your whole architecture will be a monolith made even more annoying by splitting it into several repositories.
-- (added in response to comments):
Now, you suggest that this problem doesn't happen as long as microservices are depending on a specific version of the library. This doesn't solve the problem, though. It just moves the work around.
Let's say that you depend on a specific version of the library, and 6 teams have made their own modifications to the library since that version. Following your advice, none of the teams bothered to talk to your team about it, so now it's a mess and you have a choice:
1. Spend all the time required to fix any problems that their changes might have caused in your service (and don't bother to talk to them about it, so they'll have to do the same thing on their next upgrade), and then upgrade; or
2. Just fork the library to get rid of all their changes and fix just your own problems.
Choice 2 is the right choice, but almost nobody does this! Because all the services share the same library, people come to treat sharing it as some kind of business rule.
Since people are very reluctant to fork the library after the pattern of sharing it is established, it's better to fork it at the start, i.e., just don't share it around in the first place.

Related

Is there any need to switch to modules when migrating to Java 9 or later?

We're currently migrating from Java 8 to Java 11. However, upgrading our services was less painful than we anticipated. We basically only had to change the version number in our build.gradle file and the services were happily up and running. We upgraded libraries as well as (micro)services that use those libs. No problems so far.
Is there any need to actually switch to modules? This would generate needless costs IMHO. Any suggestion or further reading material is appreciated.
To clarify, are there any consequences if Java 9+ code is used without introducing modules? E.g. can it become incompatible with other code?
No.
There is no need to switch to modules.
There has never been a need to switch to modules.
Java 9 and later releases support traditional JAR files on the traditional class path, via the concept of the unnamed module, and will likely do so until the heat death of the universe.
Whether to start using modules is entirely up to you.
If you maintain a large legacy project that isn’t changing very much, then it’s probably not worth the effort.
If you work on a large project that’s grown difficult to maintain over the years then the clarity and discipline that modularization brings could be beneficial, but it could also be a lot of work, so think carefully before you begin.
If you’re starting a new project then I highly recommend starting with modules if you can. Many popular libraries have, by now, been upgraded to be modules, so there’s a good chance that all of the dependencies that you need are already available in modular form.
If you maintain a library then I strongly recommend that you upgrade it to be a module if you haven’t done so already, and if all of your library’s dependencies have been converted.
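For illustration, a library's module declaration can be quite small; a minimal sketch, with made-up module and package names:

module com.example.mylib {
    requires java.sql;               // declare only what the library actually needs
    exports com.example.mylib.api;   // expose the public API packages only
    // internal implementation packages are simply not exported
}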
All this isn’t to say that you won’t encounter a few stumbling blocks when moving past Java 8. Those that you do encounter will, however, likely have nothing to do with modules per se. The most common migration problems that we’ve heard about since we released Java 9 in 2017 have to do with changes to the syntax of the version string and to the removal or encapsulation of internal APIs (e.g., sun.misc.Base64Decoder) for which public, supported replacements have been available for years.
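For example, code that used the encapsulated sun.misc decoder can move to the standard java.util.Base64 API that has shipped since Java 8; a minimal sketch:

import java.util.Base64;

public class Base64Demo {
    public static void main(String[] args) {
        // supported replacement for sun.misc.BASE64Decoder / BASE64Encoder
        byte[] decoded = Base64.getDecoder().decode("aGVsbG8=");
        String encoded = Base64.getEncoder().encodeToString(decoded);
        System.out.println(new String(decoded) + " -> " + encoded);
    }
}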
I can only tell you my organization's opinion on the matter. We are in the process of moving to modules, for every single project that we are working on. What we are building is basically microservices plus some client libraries. For microservices the transition to modules is a somewhat lower priority: the code there is already somewhat isolated in the docker container, so "adding" modules in there does not seem (to us) very important. This work is being picked up slowly, but it's low priority.
On the other hand, client libraries are an entirely different story. I cannot tell you the mess we have sometimes. I'll explain one point that I hated before Jigsaw. You expose an interface to clients, for everyone to use. Automatically that interface is public: exposed to the world. Usually, what I do is have some package-private classes that are not exposed to the clients and that use that interface. I don't want clients to use those; they are internal. Sounds good? Wrong.
The first problem is that when those package-private classes grow, and you want more classes, the only way to keep everything hidden is to create classes in the same package:
package abc:
-- /* non-public */ Usage.java
-- /* non-public */ HelperUsage.java
-- /* non-public */ FactoryUsage.java
....
When it grows (in our case it does), those packages get way too big. Move to a separate package, you say? Sure, but then HelperUsage and FactoryUsage would become public, and we tried to avoid that from the beginning.
Problem number two: any user/caller of our clients can create the same package name and extend those hidden classes. It has happened to us a few times already; fun times.
Modules solve this problem in a beautiful way: public is not really public anymore; I can have friend access via the exports ... to directive. This makes our code lifecycle and management much easier. And we get away from classpath hell. Of course maven/gradle handle that for us, mainly, but when there is a problem, the pain will be very real. There could be many other examples, too.
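A minimal sketch of that friend access, with hypothetical module names (using the abc package from the earlier example):

module com.example.client {
    // package abc is visible only to the listed friend module, not to the world
    exports abc to com.example.client.impl;
    // non-exported helper packages stay module-private, and outsiders
    // can no longer inject their own classes into package abc
}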
That said, the transition is (still) not easy. First of all, everyone on the team needs to be aligned; second, there are hurdles. The biggest two I still see are: how do you separate each module, based on what, specifically? I don't have a definite answer yet. The second is split packages, oh the beautiful "same package is exported by different modules". If this happens with your own libraries, there are ways to mitigate it; but if these are external libraries... not that easy.
If you depend on jarA and jarB (separate modules) and they both export abc.def.Util, you are in for a surprise. There are ways to solve this, though. Somewhat painful, but solvable.
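To illustrate the split-package situation, a sketch with hypothetical names:

// module-info.java of jarA
module com.example.jarA {
    exports abc.def;   // contains abc.def.Util
}

// module-info.java of jarB
module com.example.jarB {
    exports abc.def;   // also contains abc.def.Util
}

// A module that requires both will fail module resolution, because
// the module system forbids reading the same package from two modules.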
Overall, since we migrated to modules (and still are migrating), our code has become much cleaner. And if your company is a "code-first" company, this matters. On the other hand, I have been involved in companies where this was seen as "too expensive" or of "no real benefit" by senior architects.

Radical modularity in Java [closed]

The development team I'm a part of wrote and still maintains a codebase of pure Java spaghetti code. Most of this was implemented before I joined this team.
My background is in Python/Django development, and the Django community really emphasizes "pluggability": a feature of Django "apps" (modules) which implies mostly "do one thing and do it well", reusability, loose coupling, and a clean, conscious API. Once the Django team I worked on started to "get it", we had pretty much zero problems with messy, tightly coupled masses of monolithic code.
Our practice involved developing Django apps outside of the Django project in which we intended to use the app (a common practice among Django developers, from what I've gathered). Each even lived in a Git repository separate from that of the overall Django project.
Now, we're presented with an opportunity to refactor much of our spaghetti code, and it seems to me that some of what I learned from the Django community should be applied here. In short, I'd like to see our new codebase developed as a series of "pluggable" modules each written under the assumption that it won't have access to other modules (except those on which it should rationally depend). I believe this should do a good job of driving home principles of proper software design to the whole team.
So, what I'm advocating for is to have one separate repository per "feature" we want in our new (Spring) project. Each would have its own independent build process, and the result would be a .jar. We'd also have a repository for project-level things (JSPs, static files, etc.) whose build process would produce a .war. The .jars wouldn't be placed inside the .war, but rather treated as Gradle dependencies (the same way third-party dependencies would be).
Now I have to sell it to the boss. He's asked for some example of precedent for this plan. Obvious places to look would be open-source projects, but if a project is split across multiple repositories, it's likely to be multiple projects. So, perhaps I'm looking for some sort of suite. Spring itself looks promising as an example, but I haven't been able to find many others.
My questions are (and sorry for the long back-story):
Is there any such precedent?
What examples are there?
Is there any documentation (even just a blog post would be of help) out there advocating anything like this?
Any suggestions for implementing this?
Is this even a good idea?
Thanks in advance!
Edit: Whether or not to refactor is not in question. We have already decided to make some drastic changes to most of our code -- not primarily for the purpose of "making it cleaner" in fact. My question is about whether our planned project structure is sound and how to justify it to the decision-makers.
The following issues are more important than where to put code on the disk or in an artifact:
If you don't understand that, you have already failed.
What you describe is not refactoring, it is rewriting using a more palatable name:
Unless you have 100% of the code covered by unit tests already, someone is going to get fired over this when (not if) this effort fails spectacularly, probably multiple times!
Even with awesome unit tests, someone is going to miss something, and someone is going to take the fall when it finally gets discovered in production, usually after months of silently corrupting data.
Semantics are Important:
Removing Struts and replacing it with Spring is not refactoring; it is rewriting, by definition. Refactoring would be moving from Struts 1.1 to 2.0; replacing Struts means replacing all the Struts code with something else, and by definition that is rewriting, not refactoring.
Working Software comes in many disguises:
The business always thinks what they have is working.
Miss a deadline, introduce a bug no matter how minor, lose a feature no matter how minor, misinterpret an undocumented process and change something no matter how minor or even for the better: they are just looking for any problem, weakness, or potential trouble to spread FUD and make sure your effort is a failure, at least most of the time.
All these things will cost you and your team political capital; someone will take the fall for these things, no matter how innocent or meritless the perceived failures are!
Projected outcome:
Most likely it will be you and/or the other "non-Java" developers, and not the core Java people who created this working system, who take the blame when the system you "non-Java" people refactored (code word for rewrote, in this case) is delivered on time but broken and incomplete, or isn't delivered on time, or isn't delivered at all.
I think it's possible to find examples of good and bad implementation in both Java and Python.
Spring is indeed a good place to start. It emphasizes good layering, designing to interfaces, etc. I'd recommend it highly.
Spring's layering is "horizontal". If "module" means "business functionality" to you, then multiple modules will be present in a given layer.
Pluggable suggests a more vertical approach.
Maybe another way to attack it would be to decompose the system into a set of independent REST web services. Decouple the functionality from the interface completely. That way the services can be shared by web, mobile, or any other client that comes along.
This will require strict isolation of services and ownership of data sources. A single web service will own/manage a swath of data.
I'd recommend looking at Michael Feathers' book "Working Effectively With Legacy Code". It's ten years old, but still highly regarded on Amazon.
One thing I like about the REST web services approach is you can isolate features and do them as time and budget permit. Create and test a service, let clients exercise it, move on to the rest. If things aren't too coupled you can march through the app that way.
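If you go that route, each service can stay small. A minimal sketch of one such service using Spring MVC annotations; all names here are made up:

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class CustomerController {

    private final CustomerRepository repository; // hypothetical data-access interface

    public CustomerController(CustomerRepository repository) {
        this.repository = repository;
    }

    // the service owns its data; web, mobile, or other clients share this API
    @GetMapping("/customers/{id}")
    public Customer findCustomer(@PathVariable long id) {
        return repository.findById(id);
    }
}

interface CustomerRepository {
    Customer findById(long id);
}

class Customer {
    long id;
    String name;
}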
Refactoring an entire application is not an easy task, especially if the code is hard-wired as you have described. If pluggability is an important goal, I would recommend Grails/Groovy.
If you can somehow identify the view, controller, and business logic, or better yet services, you might be able to reuse some existing code. The good thing about Grails/Groovy is that you are able to mix JSP/Java with GSP/Groovy.
Again, this is a really hard sell, and Grails is probably a good framework to ease the refactoring pain.
I would recommend watching "Four Strategies for Dealing with Legacy Code" by Eric Evans, the originator of Domain-Driven Design: http://dddcommunity.org/library/evans_2011_2/
It may not entirely answer your question but it offers arguments and strategies that might help with your endeavour.
Every component is in its own jar, but the jars are not contained in the war? How would that work? Even third-party libs are included in the war, except for those that are provided by the container. Regarding the JSP stuff: Can I serve JSPs from inside a JAR in lib, or is there a workaround?

Ripping out Hibernate/Mysql for MongoDB or Couch for a Java/Spring/Tomcat web application

I have an application that is undergoing massive rework, and I've been exploring different options - chug along 'as is', redo the project in a different framework or platform, etc.
When I really think about it, here are 3 major things I really dislike about java:
Server starts/stops when modifying controllers or other classes. Dynamic languages are a huge win over Java here.
Hibernate: lazy-loading exceptions (especially those that occur in asynchronous service calls or during Jackson JSON marshalling) and ORM bloat in general. Hibernate, all by itself, is responsible for slow integration start-up times and insanely slow application start-up times.
Java stupidity: inconsistent class-loading problems when running your app inside your IDE compared to Tomcat. Granted, once you iron out these issues, you most likely won't see them again. Even still, most of these are actually caused by Hibernate, since it insists on a specific Antlr version and so on.
After thinking about the problem... I could solve or at least improve the situation in all 3 of these areas if I just got rid of Hibernate.
Have any of you reworked a 50+ entity Java application to use Mongo, Couch, or a similar database? What was the experience like? Do you recommend it? How long did it take you, assuming you have some pretty great unit/integration tests? Does the idea sound better than it really is?
My application would actually benefit in many areas if I could store documents. It would actually open up some very cool and interesting features for this application. However, I do like being able to create dynamic queries for complex searches... and I'm told that Couch can't do those.
I'm really green when it comes to NoSQL databases, so any advice on migrating (or not migrating) a big java/spring project would be really helpful. Also, if this is a good idea, what books would you recommend I pick up to get me up to speed and really make use of them for this application in the best way possible?
Thanks
In any case, your rant doesn't just cover problems with the previously made (legacy) decision for Hibernate, but also with your development as a programmer in general.
This is how I would do it, should a similar project be dropped in my lap and in dire need of refactoring or improvement.
It depends on the stage of your software's lifetime and the time pressure involved whether you should make big changes or stick with smaller ones. Nevertheless, migrating in increments seems to be your best option in the long term.
Keeping the application written in Java for the short term seems wise; a major rewrite in another language would definitely break acceptance and integration tests.
As suggested by Joseph, make the step from Hibernate to JPA. It shouldn't cost too much time, and from there you can switch the back-end to some other way of storage. Work towards a way of separating concerns. Pick whatever concept seems best; some prefer MVC while others might opt for CQRS and still others adore another style of segmentation/separation.
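For example, an entity written against plain JPA annotations (a hypothetical model, using the javax.persistence API of that era) keeps the persistence provider swappable:

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

@Entity
public class PurchaseOrder {

    @Id
    @GeneratedValue
    private Long id;

    private String customerName;

    protected PurchaseOrder() { } // JPA requires a no-arg constructor

    public PurchaseOrder(String customerName) {
        this.customerName = customerName;
    }

    public Long getId() { return id; }
    public String getCustomerName() { return customerName; }
}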
Since the JVM supports many languages, you can always switch to any of those or at least partially implement functionality in more dynamic languages. This will solve part of the problem where you keep bumping into the "stupidity" of Java, while still retaining the excellent optimizations of current JVMs at runtime.
In addition, you might want to set up automatic integration tests... since the application will hopefully never be run from your IDE, these tests will give you honest results.
Side note: I never trust my IDE to get dependencies right if the IDE has capabilities to inject its own libraries into my build or runtime path.
So to recap in short: small steps; lose Hibernate and go more abstract to JPA; if Java becomes stupid, then gradually switch to a clever language. Your primary concern should be to restructure the code base without losing functionality, keeping in mind to have an open design which will make adding interesting and cool features easier later on.
Well, much depends on things like "what exactly are the pain points with Hibernate?" (I know, you gave three examples...)
But those aren't core issues over the long haul. What you're running into is the nature of a compiled language vs. a dynamic one; at runtime, it works out better for you (as Java is faster and more scalable than the dynamic languages, based on my not-quite-exhaustive tests), but at development time, it's less amenable to just hacking crap together and hoping it works.
NoSQL isn't going to fix things, although document stores could, but there's a migration step you're going to have to go through.
Important: I work for a vendor in this space, which explains my experience in the area, as well as the bias in the next paragraph:
You're focusing on open source projects, I suppose, although what I would suggest is using a commercial product: GigaSpaces (http://gigaspaces.com). There's a community edition, that would allow you to migrate JPA-based java objects to a document model (via the SpaceDynamicProperties annotation); you could use JPA for the code you've written and slowly migrate to a fully document-oriented model at your convenience, plus complex queries aren't an issue.
All of those points usually cause problems due to incompetence rather than Hibernate or Java being problematic:
Apart from structural modifications (adding fields or methods), all changes in the Java code are hot-swapped in debug mode, so you can save and test without any redeploy.
The LazyInitializationException is a problem for Hibernate beginners only. There are many clear solutions to it, and you'll find them with a simple Google or SO search. You can always set your collections to fetch=FetchType.EAGER, or use Hibernate.initialize(..) to initialize lazy collections (see the sketch after this list).
It is entirely normal for a library to require a specific version of another library (the opposite would be suspicious and wrong). If you keep your classpath clean (for example by using Maven or Ivy), you won't have any classloading issues. I never have.
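A sketch of the two usual fixes mentioned in the second point; the entities are made up:

import java.util.List;
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.Id;
import javax.persistence.OneToMany;

@Entity
public class Author {

    @Id
    private Long id;

    // option 1: fetch the collection together with the owning entity
    @OneToMany(fetch = FetchType.EAGER)
    private List<Book> books;

    public List<Book> getBooks() { return books; }
}

@Entity
class Book {
    @Id
    private Long id;
}

// option 2: while the session is still open, force-load a lazy collection:
//     Hibernate.initialize(author.getBooks());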
Now, I will provide an alternative: Spring Data is a new portfolio project by SpringSource that allows you to use your entities with a bunch of NoSQL stores.
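With Spring Data, the repository layer can shrink to an interface. A minimal sketch against the MongoDB module, with a hypothetical domain type:

import java.util.List;
import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.Document;
import org.springframework.data.mongodb.repository.MongoRepository;

// hypothetical document class stored in MongoDB
@Document
public class Person {
    @Id
    private String id;
    private String lastName;
}

// Spring Data generates the implementation at runtime,
// deriving the query from the method name
interface PersonRepository extends MongoRepository<Person, String> {
    List<Person> findByLastName(String lastName);
}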

Can OSGi help reduce complexity?

I saw lots of presentations on OSGi and I think it sounds promising for enforcing better modularization. Apparently "hot deployment" and "running different versions of x in parallel" are major selling points too.
I wonder whether what OSGi promises to solve is even an issue...? It reminds me of the early days of OO, when similar claims were made:
When OO was new, the big argument was reusability. It was widely claimed that when using OO, one would only have to "write once" and could then "use everywhere".
In practice I only saw this working for some pretty low-level examples. I think the reason for this is that writing reusable code is hard. Not technically, but from an interface design point of view. You have to anticipate how future clients will want to use your classes and make the right choices up front. This is difficult by definition, and thus the potential reusability benefit often failed to deliver.
With OSGi, I have the suspicion that here again we could fall for promises: potential solutions for problems that we don't really have. Or if we have them, we don't have them in big enough quantity and severity to justify buying into OSGi for help. "Hot deployment" of a subset of modules, for example, is definitely a great idea, but how often does it really work? How often does it fail because it turned out you got the modularization wrong for the particular issue? How about model entities that are shared between multiple modules? Do these modules all have to be changed at the same time? Or do you flatten your objects to primitives and use only those in inter-module communication, in order to be able to keep interface contracts?
The hardest problem when applying OSGi is, I would presume, to get the modularization "right". Similar to getting the interfaces of your classes right in OO, with OSGi the problem stays the same, on a bigger scale this time: the package or even service level.
As you might have guessed, I'm currently trying to evaluate OSGi for use in a project. The major problem we have, is increasing complexity as the codebase grows and I would like to break the system up in smaller modules that have less and more defined interactions.
Given that no framework can help with deciding what to modularize, has OSGi ever paid off for you?
Has it made your life easier when working in teams?
Has it helped to reduce bug count?
Do you ever successfully "hotdeploy" major components?
Does OSGi help to reduce complexity over time?
Did OSGi keep its promises?
Did it fulfill your expectations?
Thanks!
OSGi pays off because it enforces modularization at runtime, something you previously did not have, often causing the design on paper and implementation to drift apart. This can be a big win during development.
It definitely makes it easier to work in teams, if you let teams focus on a single module (possibly a set of bundles) and if you get your modularization right. One could argue that one can do the same thing with a build tool like Ant+Ivy or Maven and dependencies; the granularity of dependencies that OSGi uses is in my opinion far superior, not causing the typical "dragging in everything plus the kitchen sink" that JAR-level dependencies cause.
Modular code with fewer dependencies tends to lead to cleaner and smaller code, in turn leading to fewer bugs that are easier to test for and solve. It also promotes designing components that are as simple and straightforward as possible, whilst at the same time having the option to plug in more complicated implementations, or to add aspects such as caching as separate components.
Hot deployment, even if you do not use it at runtime, is a very good test to validate if you modularized your application correctly. If you cannot start your bundles in a random order at all, you should investigate why. Also, it can make your development cycle a lot quicker if you can update an arbitrary bundle.
As long as you can manage your modules and dependencies, big projects stay manageable and can be easily evolved (saving you from the arguably bad "complete rewrite").
The downside of OSGi? It's a very low-level framework, and whilst it solves the problems it is intended for quite well, there are things that you still need to resolve yourself. Especially if you come from a Java EE environment, where you get free thread-safety and some other concepts that can be quite useful if you need them, you need to come up with solutions for these in OSGi yourself.
A common pitfall is not to use abstractions on top of OSGi to make this easier for the average developer. Never ever let them mess with ServiceListeners or ServiceTrackers manually. Carefully consider what bundles are and are not allowed to do: are you willing to give developers access to the BundleContext, or do you hide all of this from them by using some form of declarative model?
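For example, with a declarative model such as OSGi Declarative Services, nobody touches ServiceTrackers directly; a minimal sketch with a made-up component:

import org.osgi.service.component.annotations.Activate;
import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Reference;
import org.osgi.service.log.LogService;

// published and wired by the Declarative Services runtime;
// no ServiceListener or ServiceTracker code anywhere
@Component
public class GreetingComponent {

    @Reference
    private LogService log; // injected once an implementation is available

    @Activate
    void activate() {
        log.log(LogService.LOG_INFO, "GreetingComponent started");
    }
}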
I've worked with OSGi for some years now (although in the context of an eclipse project, not in a web project). It is clear that the framework does not free you from thinking how to modularize. But it enables you to define the rules.
If you use packages and define (in a design document? verbally?) that certain packages may not access classes in other packages, then without enforcement of this constraint it will be broken. If you hire new developers, they don't know the rules. They WILL break the rules. With OSGi you can define the rules in code. For us this was a big win, as it has helped us to maintain the architecture of our system.
OSGi does not reduce complexity. But it definitely helps to handle it.
I have been using OSGi for over 8 years now, and every time I dive into a non-OSGi project I get the feeling of speeding without a seatbelt on.
OSGi makes project setup and deployment harder, and forces you to think about modularization upfront, but gives you the peace of mind of enforcing the rules at runtime.
Take Maven and Apache Camel as an example. When you create a new Maven project and add Apache Camel as a dependency, the application seems to have all its dependencies, and you will only notice the ClassNotFoundExceptions at runtime, which is bad. When you run in an OSGi container and load the Apache Camel modules, the modules with unmet dependencies are not started, and you know upfront what the problem is.
I also use the hot-deployment all the time, and update parts of my application on the fly without the need for a restart.
I used OSGi in one project (I admit, not very much). It makes good promises, but as @Arne said, you still need to think on your own about how you modularize.
OSGi did help our project because it made the architecture more stable. Breaking the modularization is more "difficult", so decisions that we made regarding how to modularize stayed valid for a longer time.
To put it differently: without OSGi, and under time pressure to deliver, sometimes you or your team members make compromises, shortcuts, and other hacks, and the original intent of the architecture is lost.
So OSGi didn't reduce the complexity per se, but it protected it from growing unnecessarily over time. I guess that is a good thing :)
I haven't used the hot deploy feature, so I can't comment about that.
To answer your last point: it did meet my expectations, but it required a learning curve and some adaptation, and the payoff is only long-term.
(as a side note, your question reminds me a bit of the adage that "maven is the awt of build systems")
OSGi does NOT pay off. The fact is OSGi is not easy to use, and at the end of the day or year (depending on how long it takes you to get things working) it does not add value:
Your application will not be more modular overall; on the contrary, it ends up being more exposed and not isolated from other applications, since it is a share-everything instead of a share-nothing architecture.
Versioning is pushed further down the stack; you wrestle with Maven transitive dependencies only to do it again at runtime in OSGi.
Most libraries are designed to work as libraries on the application classloader, not as bundles with their own classloader.
Maybe it is appropriate for plugin architectures where third-party developers need to be sandboxed, or maybe it is just EJB 2.0 all over again.
I added the following slides and I will follow up with example code to demonstrate how to work successfully with OSGi if it is forced on you.
http://www.slideshare.net/ielian/tdd-on-osgi
No, OSGi will make you grey early.

How are very large Java EE web sites structured?

A best-practice question: how are very large websites best structured with Java?
I'm interested in knowing how the deployments themselves are structured.
Some possible answers:
A single EAR, with or without session sharing between the constituent WARs?
Multiple WARs, with or without session sharing?
Multiple modules that are assembled into one big WAR at deployment time?
Is there any documented best practice for this?
If you could find data, I'd bet you'd have examples of each one (and perhaps more besides).
I don't know that there's a "best practice" that's uniformly followed.
Your biggest concerns appear to be session sharing and deployment. Regardless of how it's done, I'd say that session data ought to be minimized, and sharing between WARs? No. One owner for data, please. Sharing suggests that you've spread functionality for a single use case across modules. This will lead to grief someday.
As far as packaging goes, I'd say that the bigger the package the more code is affected by changes. If you can partition something into two independent WARs, you can change one without bringing the other down. That's better for maintenance.
When you say "very large websites", I assume website that have too many modules and sub modules. Not the site with heavy traffic. If you are intrested in site with heavy traffic, please rephrase and post a different question or update this one.
Keep it modular with multiple EARs/WARs. Having it modular gives each module the opportunity to evolve independently of the others unless there is a change in a module's interface; if there is, the dependent modules need to be updated as well. Although not a website, one such example is the Eclipse IDE: each of its modules is developed and maintained independently and has its own version. Having the website modular also gives you the opportunity to deploy individual modules on separate machines/servers, so they can scale individually.
Session sharing across EARs/WARs would be considered a bad idea. It is too much work for the server, and it can introduce bugs that are hard to debug, just like global variables. But having the user log in again and again for each module is also really bad, so you would need to implement some "Single Sign-On (SSO)" solution. As far as examples are concerned, www.google.com, www.gmail.com, www.orkut.com, etc. are all Google services, and each service is like an individual module. If you sign into one of the services and then open another without signing out, you automatically get logged in.
Although assembling all modules into one big WAR is a bad idea, once a year you can have all modules deployed together (not as a single WAR but individually) and give the release a name. A similar thing can be seen with Eclipse: Eclipse has updates for individual modules from time to time, but once a year they have a major release where all modules are upgraded (Europa, Ganymede, Galileo, ...).
Every application is different and has different requirements. There are no specific best practices for large websites, as it all depends on the website being developed. Say, for example, session sharing wouldn't normally be good practice, but a business requirement may drive you to do it, or some alternate method could be used to share information across modules.
