We have just started a new project using Spring Boot, which will have a monolithic architecture. There has been some talk about using Docker to containerize the application.
Are there any benefits other than easier deployment across different platforms?
I would also like to understand whether auto-scaling applies here, and if so, how.
Thanks in advance!
I would just like to add that lots of people tend to focus on deployment, but the advantages for security are enormous. The process isolation, within what could be seen as an advanced jail, could by itself make the case for Docker.
Another advantage is how it complements your CI/CD efforts and methodologies. By including image building in the application build process, you get better control of the overall process, including a clearer view of its cycles.
Besides that, you also expand the number of ecosystems where your application can be easily installed and run. Add the support of a swarm or Kubernetes and you gain access to the current hardened and managed cloud solutions.
With a bit of a stretch we can talk about scalability, if your image is meant to cooperate with replicas of itself, or if you put your containers where the hardware itself is elastic. Scalability also comes into the discussion when you use resource limits to prevent services from competing with each other. This is true even if you do not have a cluster, as you can also manage hardware usage within a single host.
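For illustration, a minimal sketch of capping a single container's resources on one host (the image name is hypothetical):

    # Limit one container to 512 MB of RAM and 1.5 CPU cores
    docker run -d --name spring-app \
      --memory=512m --cpus=1.5 \
      myorg/my-spring-app:latest

This keeps a noisy service from starving its neighbors even without any orchestrator in the picture.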
Now it really depends on your needs and niche. Some environments, for instance, would benefit in obvious ways even if scalability is not a concern. The internal networks you can create, for instance, are an excellent reason to adopt Docker: you get process isolation and network isolation inside a small host. Of course Docker is not meant to be a security solution by itself, but it adds to the ones you already have.
I think it really depends on the scale of your application. The main benefit will certainly be ease of deployment and development, whether on premises or on a cloud provider.
If you are running other services alongside Spring, like a database, cache server, or other applications, you should have a look at docker-compose. It really simplifies the deployment not just of the Spring app, but also of all its dependencies.
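As a minimal sketch, assuming a Postgres database and a Redis cache (image tags, credentials, and the Spring properties are placeholders to adapt):

    version: "3"
    services:
      app:
        build: .                    # Dockerfile for the Spring Boot app
        ports:
          - "8080:8080"
        depends_on:
          - db
          - cache
        environment:
          SPRING_DATASOURCE_URL: jdbc:postgresql://db:5432/appdb
          SPRING_REDIS_HOST: cache
      db:
        image: postgres:13
        environment:
          POSTGRES_DB: appdb
          POSTGRES_PASSWORD: example   # use a proper secret in real deployments
      cache:
        image: redis:6

A single docker-compose up then starts the app together with all of its dependencies.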
Docker could also help a lot in case you plan to scale your application to multiple nodes, using Docker Swarm.
As for autoscaling, it is not really supported by Docker out of the box, but you could achieve it with other tools on top of Docker Swarm.
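As a rough sketch, manual scaling on a swarm looks like this (the image name is hypothetical); an autoscaler essentially watches metrics and issues the same scale command for you:

    docker swarm init
    docker service create --name app --replicas 2 -p 8080:8080 myorg/my-spring-app:latest
    # Later, scale up by hand (or from a metrics-driven script):
    docker service scale app=5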
I have been testing out microservices lately, using Spring Boot to create microservice projects. The more I understand about the setup, the more questions I am confronted with.
How are all the running microservices managed? How do developers manage, deploy, or update microservices from a central location?
When deploying multiple instances of a microservice, do you leave the port to be decided at runtime, or should it be predefined?
I am sure many more questions will pop up later.
Links used:
http://www.springboottutorial.com/creating-microservices-with-spring-boot-part-1-getting-started
https://fernandoabcampos.wordpress.com/2016/02/04/microservice-architecture-step-by-step-tutorial/
Thanks in advance.
Microservices do tend to go out of control sooner rather than later. With so many services floating around, you need to think about deployment and monitoring strategies ahead of time.
Neither of these is an easy problem, but you have quite a few tools at your disposal.
Start with CI/CD. Search around it and you will find your way. One option is to make use of Jenkins for blue/green deployments.
In this case Jenkins will be the one central place where you manage your deployments (but this is just an example; there are quite a lot of tools built around this that may serve you better depending on your needs).
The other part of this problem lies in where you tend to deploy your stuff. Different cloud providers have their own specific ways of handling microservices, and it really depends on your host. But one alternative is to make use of containers.
If you go with raw containers like Docker directly, you will have to take care of mapping ports yourself (if the containers are deployed on the same host machine). But then you can use an abstraction on top of this: if you are on AWS you can consider ECS, or Docker Swarm, or, my personal preference, Kubernetes. There you do not need to worry about which ports the services are on and can talk to a service directly over a load balancer. There is a lot missing here and you really need to pick one such tool and dig deep, but there are options out there for you to explore.
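To make the port point concrete, here is a minimal Kubernetes sketch (names and image are hypothetical): the Deployment runs the replicas wherever the scheduler places them, and the Service gives callers one stable address in front of all of them.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-service
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: my-service
      template:
        metadata:
          labels:
            app: my-service
        spec:
          containers:
            - name: my-service
              image: myorg/my-service:1.0
              ports:
                - containerPort: 8080
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: my-service
    spec:
      type: LoadBalancer
      selector:
        app: my-service
      ports:
        - port: 80          # stable port callers use
          targetPort: 8080  # container port behind it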
Next is monitoring. If you are going with Kubernetes, you get a lot of monitoring tools out of the box that help you access and query service logs. But you also need to make sure, from a development perspective, that you provide correlation IDs, API metrics, and response times, because you will need them to debug issues with microservices, especially latency-related ones. If you are not on Kubernetes you can still add all of these features individually: the ELK stack for log monitoring (as you do not want to go to each service to check for logs), Zipkin for tracing, and an API gateway and load balancers for service discovery and for talking to containers.
Hope this helps you get started.
You can start with the following:
Monitoring:
Start with spring-boot-admin and Prometheus.
https://github.com/codecentric/spring-boot-admin
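A minimal sketch of the wiring, assuming spring-boot-starter-actuator and micrometer-registry-prometheus are on the classpath (host and port are placeholders):

    # application.yml: expose the Prometheus endpoint via Actuator
    management:
      endpoints:
        web:
          exposure:
            include: health,info,prometheus

    # prometheus.yml: scrape that endpoint
    scrape_configs:
      - job_name: spring-app
        metrics_path: /actuator/prometheus
        static_configs:
          - targets: ["app:8080"]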
Deployment:
Start with Docker and docker-compose, then move to Kubernetes.
A few examples for docker-compose:
https://github.com/jinternals/spring-cloud-stream
https://github.com/jinternals/spring-micrometer-demo
There are container services and container-management systems available, for example Amazon ECS, Azure Container Service, and Kubernetes. They take care of automated deployment from centralized repositories (such as Amazon ECR), automated scale-up/down of microservice instances, dynamic port allocation (to run multiple instances of the same service on a single host), and a centralized dashboard to monitor resource usage and infrastructure events.
You can make use of any one of them to get answers to all of your questions, as they all provide most of the functionality needed for managing your microservices.
We have a quite large monolithic app (Java/Spring) and we are considering splitting it up into microservices, using Spring Cloud to reuse existing solutions for some common problems (discovery, redundancy, etc.). Currently we run one instance (with different modules) per client.
Some of our clients are small, so one VPS handles them, while others are larger and might use multiple servers.
The problem is that this "pack" of microservices should be isolated per environment, as the environments might be slightly different.
As I read through resources about Cloud Foundry (which looks really great), it seems it would be best to run a Cloud Foundry instance per client, but I am afraid that is overkill and quite a lot of work to get one client running (and I would like to automate as much as possible).
Ideal Solution
BEGIN
We provide servers with heterogeneous OSes, possibly containers (VM/Docker/jail/...) with restrictions on where they may run, and finally services with restrictions on which containers they may run in.
When creating a new environment, I just provide the list of services to run in it, and the Solution creates the containers, deploys the services in them, and sets up the communication channels (message broker) between them.
It should also handle upgrades, monitoring, etc.
END
What approach would you recommend? Or could you please share your experience from building a similar thing?
Thanks
You could provide each customer with their own space in a single CF instance where all the microservices are deployed.
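A rough sketch with the cf CLI (customer and app names are made up):

    cf create-space customer-a
    cf target -s customer-a
    cf push customer-a-app -p build/libs/app.jar

Each space gives a customer an isolated set of apps, routes, and service bindings without the overhead of running a full Cloud Foundry installation per client.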
There seems to be a current trend in the Java space to move away from deploying Java web applications to a servlet container (or application server) in the form of a war file (or ear file), and instead to package the application as an executable jar with an embedded servlet/HTTP server like Jetty. And I mean this more in the sense that newer frameworks are influencing how new applications are developed and deployed, rather than how applications are delivered to end users (because, for example, I get why Jenkins uses an embedded container: very easy to grab and go). Examples of frameworks adopting the executable-jar option:
Dropwizard, Spring Boot, and Play (well, it doesn't run on a servlet container, but its HTTP server is embedded).
My question is: coming from an environment where we have deployed our (up to this point mostly Struts2) applications to a single Tomcat application server, what changes, best practices, or considerations need to be made if we plan on using an embedded-container approach? Currently, we have about ten homegrown applications running on a single Tomcat server, and for these smallish applications the ability to share resources and be managed on one server is nice. Our applications are not intended to be distributed to end users to run within their own environments. However, moving forward, if we decide to leverage a newer Java framework, should this approach change? Is the shift to executable jars spurred on by the increasing use of cloud deployments (e.g., Heroku)?
If you've had experience managing multiple applications in the Play style of deployment versus traditional war file deployment on a single application server, please share your insight.
An interesting question. This is just my view on the topic, so take everything with a grain of salt. I have occasionally deployed and managed applications using both servlet containers and embedded servers. I'm sure there are still many good reasons for using servlet containers but I will try to just focus on why they are less popular today.
Short version: Servlet containers are great to manage multiple applications on a single host but don't seem very useful to manage just one single application. With cloud environments, a single application per virtual machine seems preferable and more common. Modern frameworks want to be cloud compatible, therefore the shift to embedded servers.
So I think cloud services are the main reason for abandoning servlet containers. Just like servlet containers let you manage applications, cloud services let you manage virtual machines, instances, data storage and much more. This sounds more complicated, but with cloud environments, there has been a shift to single app machines. This means you can often treat the whole machine like it is the application. Each application runs on a machine with appropriate size. Cloud instances can pop up and vanish at any time which is great for scaling. If an application needs more resources, you create more instances.
Dedicated servers, on the other hand, are usually powerful but of a fixed size, so you run multiple applications on a single machine to maximize the use of resources. Managing dozens of applications, each with its own configuration, web server, routes, connections, etc., is not fun, so using a servlet container helps you keep everything manageable and yourself sane. It is harder to scale, though. Servlet containers in the cloud don't seem very useful: they would have to be set up for each tiny instance without providing much value, since they only manage a single application.
Also, clouds are cool and non-cloud stuff is boring (if we still believe the hype). Many frameworks try to be scalable by default, so that they can easily be deployed to the clouds. Embedded servers are fast to deploy and run, so they seem like a reasonable solution. Servlet containers are usually still supported but require a more complicated setup.
Some other points:
The embedded server could be optimized for the framework, or is better integrated with the framework's tooling (like the Play console, for example).
Not all cloud environments come with customizable machine images. Instead of writing initialization scripts to download and set up servlet containers, using dedicated software for cloud application deployments is much simpler.
I have yet to find a Tomcat setup that doesn't greet you with a perm gen space error every few redeployments of your app. Taking a bit longer to (re-)start embedded servers is no problem when you can almost instantly switch between staging and production instances without any downtime.
As already mentioned in the question, it's very convenient for the end user to just run the application.
Embedded servers are portable and convenient for development. Today everything is rapid: prototypes and MVPs need to be created and delivered as fast as possible. No one wants to spend too much time setting up an environment for every developer.
I've created my first Play application. What is the most suitable deployment method for production? Should I copy the whole project to the production server and run play start, or should I make a war out of my application and deploy it in Tomcat/JBoss? Which is the recommended way? I'm getting confused comparing it to its Rails-like behavior. Note that this is supposed to be a big-data application that may also serve heavy request loads later on, so we are thinking about the scalability, availability, and performance aspects too. The application is to be deployed in a cloud.
Thanks.
As others have stated, using the dist command is the easiest way to deploy Play for a one-off application. However, to elaborate, I have here some other options and my experience with them:
When I have an app that I update frequently, I usually install Play on the server and perform updates through Git. After every update I simply run play stop (to stop the running server); sometimes I then run play clean to clear out any potentially corrupted libraries or binaries; then I run play stage to ensure all prerequisites are present and to perform compilation; and finally play start to run the server for the updated app. It seems like a lot, but it is easy to automate via a quick bash script.
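Something like this, with an assumed install path:

    #!/bin/bash
    # Pull the latest code and restart the Play app
    set -e
    cd /opt/myapp          # assumed install location
    git pull origin master
    play stop || true      # don't abort if the server wasn't running
    play clean
    play stage
    play start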
Another method is to deploy Play behind a front-end web server such as Apache, Nginx, etc. This is mostly useful if you want to perform some sort of load balancing, but not required as Play comes bundled with its own server. Docs: http://www.playframework.com/documentation/2.1.1/HTTPServer
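For example, a minimal Nginx sketch balancing across two Play instances (the ports are assumed):

    upstream play_backend {
        server 127.0.0.1:9000;
        server 127.0.0.1:9001;
    }
    server {
        listen 80;
        location / {
            # Hand every request to one of the Play instances
            proxy_pass http://play_backend;
        }
    }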
Creating a WAR archive using the play2war plugin is another way to deploy, but I wouldn't recommend it unless you are giving the app to someone who already has a major infrastructure built upon the servlet containers you mentioned (as many large companies do). Using a servlet container adds a level of complexity that Play is meant to remove by design (hence the integrated server). There are no notable performance gains that I am aware of with this method over the two previously described.
Of course, there is always play dist, which creates the package for you; you upload it to your server and run the bundled start script from there. This is probably the easiest option. Docs: http://www.playframework.com/documentation/2.1.1/ProductionDist
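The flow looks roughly like this (the archive name depends on your app name and version):

    play dist
    # copy the generated zip to the server, then:
    unzip myapp-1.0.zip
    cd myapp-1.0
    ./start -Dhttp.port=9000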
For performance and scalability, the Netty server in Play will perform anywhere from very adequately to exceptionally for what you require. Here's a reputable link showing Netty with the fastest performance of all frameworks, and a "stock" Play app coming in somewhere in the middle of the field, but way ahead of Rails/Django in terms of performance: http://www.techempower.com/blog/2013/04/05/frameworks-round-2/.
Don't forget, you can always change your deployment architecture down the road to run behind a front-end server as described above if you need more load balancing and such for availability. That is a trivial change with Play. I still would not recommend the WAR deployment option unless, like I said, you already have a large installed base of servlet containers in use that someone is forcing you to serve your app with.
Scalability and performance also have a lot to do with other factors, such as your use of caching, the database configuration, the use of concurrency (which Play is good at), and the quality of the underlying hardware or cloud platform. For instance, Instagram and Pinterest serve millions of people every day on a Python/Django stack, which has mediocre performance by all popular benchmarks. They mitigate that with lots of caching and high-performing databases (which are usually the bottleneck in large applications).
At the risk of making this answer too long, I'll just add one last thing. I, too, used to fret over performance and scalability, thinking I needed the most powerful stack and configuration around to run my apps. That just isn't the case any more unless you're talking like Google or Facebook scale where every algorithm has to be finely tuned as it will be bombarded a billion times every day. Hardware (or cloud) resources are cheap but developer/sysadmin time isn't. You should consider ease of use and maintainability for deployment of your app over raw performance comparisons, even though in the case of Play the best performing deployment configuration is arguably the easiest option as well.
You don't need to use Play's console to run the application; it consumes some resources, and its main goal is fast launching during the development stage.
The best option is using the dist command, as described in the docs. Thanks to this, you don't even need to install Play on the target machine, as dist creates a ready-to-use stand-alone application containing all required elements (including the built-in server, so you don't need to deploy a WAR in any container).
If you are planning to use a cloud, you should also check the offers from, e.g., Heroku or CloudBees, which allow you to deploy your application just by... pushing changes via a git repository, which is a very comfortable way of working. Check the documentation's home page and scroll down to the "Deploying to..." links for more details.
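On Heroku, for instance, a deployment can be as simple as (the app name is made up):

    heroku create my-play-app
    git push heroku master
    heroku open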
I'm relatively new to Java EE and have already begun to hear about the many different types of systems that can be clustered:
Virtual Machines (i.e. "that appliance is a cluster of VMs...")
Application servers, such as Tomcat, JBoss or GlassFish (i.e. "We're running clustered JBoss...")
Clustering APIs like Terracotta
Databases, like Oracle ("clustered database")
Cloud applications ("A cloud is basically a cluster...")
Wikipedia defines "clustering" as:
A computer cluster consists of a set of loosely connected computers that work together so that in many respects they can be viewed as a single system.
I'm wondering how clustering works for each of these "cluster types/methods" (mentioned above) and how they relate to one another.
For instance, if one could benefit from having a clustered application, he/she would probably put it on a clustered app server and then throw a cluster manager into the mix (again, like Terracotta).
But because the phrase "clustering" seems to be used in vague/ambiguous ways, I'm not seeing how each of these ties into the others ones, or if they even do. Thanks in advance to any brave StackOverflowers out there who can help me make sense of this interwoven terminology!
To me, clustering implies a number of qualities in a system, but it boils down to fault tolerance across servers, networking, and data persistence. There are both loosely and tightly coupled systems and all flavors in between. Tightly coupled systems have the clustering performed at a level close to the hardware. Many of the old clustering systems were more tightly coupled, with the applications often not recognizing that they were clustered.
Loosely coupled systems are the norm these days with a large degree of the fault tolerance accomplished at a software level entirely. Systems in the cluster only share network connectivity to be able to accomplish fault tolerance. Usually there are some specialized load balancers which route requests to the various cluster servers using specialized hardware (sometimes just software) to accomplish this.
All of the examples you mentioned have some sort of "clustering". It would take a very long answer to describe the details of how each of these architectures accomplishes this. For me, the differences are what comes "for free" when you use the architecture, and how much work you will have to do to get it to work optimally.
How you mix and match the solutions you've mentioned depends on what your architecture looks like and your requirements. You can have a Terracotta store for local high speed persistence and the cloud for the rest. You can use Glassfish as your application server and utilize Terracotta as your persistence layer.
Here are my thoughts about the technologies you listed:
Cloud applications ("A cloud is basically a cluster...")
Cloud applications are obviously the easiest to work with. Your only job from an architecture standpoint is to pick a good cloud provider. Certainly Amazon and Google will do it "right" in terms of fault tolerance and data integrity. There are many other players that probably do it "good enough" and are cheaper. You program against their APIs, which come with their own set of limitations and expenses. One problem with cloud applications is that it will most likely be very hard to switch to a new provider. Again, you might have some [large] portion of your application running on cloud servers and some local systems for your stricter latency requirements. The trend is to put most production functions in the cloud, or at least start that way until you get too big or need some services they can't provide.
Clustering APIs like Terracotta
Databases, like Oracle ("clustered database")
JBoss
These three systems provide their own clustering capabilities. They may require you to do a lot of machine- and service-layer configuration to get the system running well in a production environment. I hear good things about Terracotta, which is a distributed persistence layer. I've used JGroups a lot, which sits under JBoss; it can be tricky to get running right, but JBoss may also have some good default configurations/documentation. Oracle is most likely going to be the hardest to cluster right; DBAs make a lot of money tweaking Oracle configurations.
Virtual Machines (i.e. "that appliance is a cluster of VMs...")
Application servers, such as Tomcat, GlassFish
These are the most amorphous to define in terms of clustering. Some VMs are considered "clustered" in that they share networking hardware and power backplanes, but they are really not clusters when compared to cloud computing. As mentioned, there are some clustered hardware solutions that are very custom and will require a lot of domain-specific knowledge to get running well.
I have very little experience with application servers such as Tomcat and GlassFish. We have our own clustering software on top of JGroups and run Jetty entirely. Application servers are not, in themselves, "clustered", but packages such as JBoss and Terracotta run on top of them to provide clustering, and they have internal projects with clustering software written for them.
Hope some of this helps.
Here's a quick whack at it. How you cluster depends on what your goals are. Here are some thoughts that also tie in to GlassFish.
A cluster enables multiple instances to be managed as one since they share a common configuration. If you make a change to a configuration, such as defining a new resource, then all instances that belong to a cluster inherit that change. Deploying an application to a cluster deploys it to all instances of that cluster.
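With GlassFish's asadmin tool, that looks roughly like this (cluster, node, and instance names are made up):

    asadmin create-cluster myCluster
    asadmin create-instance --cluster myCluster --node localhost-node instance1
    asadmin create-instance --cluster myCluster --node localhost-node instance2
    asadmin start-cluster myCluster
    # Deploy once; the app lands on every instance in the cluster
    asadmin deploy --target myCluster myapp.war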
A cluster provides service availability. If one instance fails, deployed applications are still available on other instances.
A cluster can offer session availability. If an instance dies while a user has items in their shopping cart, then another instance can take ownership of handling that user's session such that the shopping cart contents are still there. The user never knows a backend server has failed.
With GlassFish, HTTP session state can be managed by GlassFish (built in), delegated to a Coherence grid, or the application can manage state itself (using Terracotta, a database, etc.). The benefit of using the built-in capability is that it works out of the box and has gone through stress testing, QA, etc. The benefit of externalizing is that you can potentially get better scalability, since you decouple session management from application logic. Externalizing lets the JVM focus on executing business logic and uses less heap space, since backup sessions live elsewhere. Oracle has tested/QA'd externalizing to the Coherence grid, and it is a formal feature of the commercial Oracle GlassFish Server. If you roll your own via a database, then you need to manage and QA it yourself.