I'm running 20+ Java Elastic Beanstalk instances that all use Classic Load Balancers and the works (autoscaling, security groups, etc.). I'm trying to figure out how to improve this setup by moving it to ECS and reducing overall resource consumption.
What I'm trying to figure out is whether it's possible for Fargate to handle 20+ different host- and path-match conditions routing to 20+ different containers in a single service.
I have a POC in place where my containers spin up on an ECS EC2 instance instead of Fargate, and I can see everything working when I visit the Route 53 CNAME I'm testing with.
I have traffic coming into a main ALB that redirects HTTP traffic to HTTPS. After the redirect, traffic is filtered through rules that route based on host and path conditions.
First rule: if host is hello.world.com and path is /java, forward to helloworld-target-group.
Second rule: if host is new.world.com and path is /java, forward to welcomeworld-target-group.
And so on.
I'm reading that a single ECS service can only have 1 ALB with a max of 5 target groups, and my initial thinking was to have the 20+ target groups on a single service with one ALB and 20+ containers.
Now I'm thinking as a possible side solution, I could have 5 different services making use of the 5 target group per service limit.
The containers will all use the same Docker image. The only difference will be the environment variables on the containers. (I need each container to have different environment variables so that Java knows which DB to use.)
Has anyone looked at a similar problem or know of a better solution?
Edit: I might be wrong, or AWS may have updated the routing rules, but as of now the answer I've landed on is not to use Fargate to route traffic to 20+ containers. Instead, use an ECS EC2 instance and set up 20+ target groups that each route to a separate port on the instance. This way you can route traffic to 20+ containers from a single instance, which is pretty cool.
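To make the per-port, per-environment-variable setup concrete, each app maps roughly to one container definition in the ECS task definition; the image name, ports, and JDBC URL below are illustrative placeholders, not from my actual setup:

```json
{
  "containerDefinitions": [
    {
      "name": "helloworld",
      "image": "mycompany/java-app:latest",
      "portMappings": [
        { "containerPort": 8080, "hostPort": 9001, "protocol": "tcp" }
      ],
      "environment": [
        { "name": "DB_URL", "value": "jdbc:postgresql://helloworld-db.example.com:5432/app" }
      ]
    }
  ]
}
```

Each target group then points at the instance on the corresponding hostPort (9001, 9002, ...), and the ALB rules forward to the right target group.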
Related
Let's say I have such architecture based on Java Spring Boot + Kubernetes:
N pods with a similar purpose (let's say: order-executors) - a GROUP of pods
other pods with other business implementation
I want to create solution where:
One (central) pod can communicate with all pods in a specific GROUP to get some information about their state (REST or any other way)
Some pods can of course be replicated, e.g. x5, and the central pod should communicate with all replicas
Is this possible with any technology? If every order-executor pod has a k8s Service, is there a way to communicate with all replicas of that pod to get info about all of them? The central pod only has the Service URL, so it doesn't know which replica pod is on the other side.
Is there any solution to autodiscovery every pod on cluster and communicate with them without changing any configuration? Spring Cloud? Eureka? Consul?
P.S.
in my architecture there are also etcd and rabbitmq deployed. Maybe they can be used as part of the solution?
You can use a "headless Service", one with clusterIP: None. The result is that when you do a DNS lookup for the normal Service DNS name, instead of a single A record pointing at the magic proxy-mesh IP, you'll get a separate A record for every pod that matches the selector and is ready (i.e. the IPs that would normally be the backends). You can also fetch these from the Kubernetes API via the Endpoints type (or EndpointSlices if you somehow need to support groups with thousands of pods, but for just 5 Endpoints is fine) using the Kubernetes Java client library. In either case, you then have a list of IPs and the rest is up to you to figure out :)
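To make the DNS half concrete, here's a minimal sketch in plain Java (no client library needed); the service name order-executors.default.svc.cluster.local used in the comment is an assumed example:

```java
import java.net.InetAddress;
import java.util.ArrayList;
import java.util.List;

public class HeadlessLookup {
    // Resolve every A record behind a service name. For a headless Service
    // this returns one IP per ready pod matching the selector.
    static List<String> resolvePeers(String serviceName) throws Exception {
        List<String> ips = new ArrayList<>();
        for (InetAddress addr : InetAddress.getAllByName(serviceName)) {
            ips.add(addr.getHostAddress());
        }
        return ips;
    }

    public static void main(String[] args) throws Exception {
        // In-cluster this would be e.g. "order-executors.default.svc.cluster.local"
        System.out.println(resolvePeers("localhost"));
    }
}
```

Note that the JVM caches DNS results, so if you poll this way you may want to tune networkaddress.cache.ttl.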
I'm not familiar with Java, but the concept is something I've done before. There are a few approaches you can take. One of them is using Kubernetes events.
Your application should listen to Kubernetes events (using a websocket). Whenever a pod with a specific label or label set is created, removed, enters a terminating state, etc., you will get updates about its state, including the pod IPs.
You can then do whatever you like in your application without having to contact the Kubernetes API yourself.
You can even use a sidecar pod which listens to those events and writes them to a file. Using a shared volume, your application can read that file and use its contents.
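As a sketch of the shared-volume variant, assuming the sidecar writes one "podName podIP" pair per line to the file (this file format is an assumption for illustration, not a standard):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

public class PeerFileReader {
    // Parse the peer list the sidecar maintains: one "podName podIP" pair per line.
    static List<String> readPeerIps(Path peerFile) throws IOException {
        return Files.readAllLines(peerFile).stream()
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .map(line -> line.split("\\s+")[1])
                .collect(Collectors.toList());
    }
}
```

Your application can re-read the file periodically or watch it with a WatchService, and never needs Kubernetes API credentials itself.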
I'm running Spring Boot services in AWS ECS using CloudMap.
Using Java 11 and Spring Boot 2.2.1.RELEASE
S(1) and S(2) are exact copies of a CPU intensive service, and C is calling them as part of servicing multiple parallel requests.
C is not resource bound, so I don't want to create more instances of it.
Calls are HTTP/REST made using com.konghq:unirest-java:jar:3.6.00, which in turn uses httpcomponents:httpclient:jar:4.5.11
Here's a little diagram:
Multiple Parallel Requests ----> C (10.1.12.25) ---------> S(1) (10.1.178.143)
\
\---> S(2) (10.1.118.82)
Using Cloudmap as Service Directory, when I dig <service-name>, it returns both IPs in the answer.
;; ANSWER SECTION:
<service-name>. 60 IN A 10.1.178.143
<service-name>. 60 IN A 10.1.118.82
Because C is only one instance, S(1) is receiving 100% of the requests from C. This makes me think C is somehow using only one of the IP addresses registered under <service-name>.
Is it possible to make C use BOTH IP addresses to invoke <service-name> without using a Load Balancer? Maybe by configuring something in Unirest and/or HttpClient?
Thanks in advance.
P.S.: This is my first question, so please be kind if not the right tag, etc. ;-)
You can use Route 53 with those IP addresses and choose a round-robin routing policy.
However, if the requests run in parallel and you don't want to overload any specific instance, a load balancer is the better option in the long run.
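If you do want to stay load-balancer-free, here's a hedged sketch of client-side round-robin over the Cloud Map A records; note the JVM caches DNS lookups, so you may also need to lower networkaddress.cache.ttl for re-resolution to matter:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinResolver {
    private final AtomicInteger counter = new AtomicInteger();

    // Pure rotation logic, kept separate so it is easy to test.
    static String pick(String[] ips, int turn) {
        return ips[Math.floorMod(turn, ips.length)];
    }

    // Resolve all A records for the service name and rotate through them.
    String nextIp(String serviceName) throws UnknownHostException {
        InetAddress[] records = InetAddress.getAllByName(serviceName);
        String[] ips = new String[records.length];
        for (int i = 0; i < records.length; i++) {
            ips[i] = records[i].getHostAddress();
        }
        return pick(ips, counter.getAndIncrement());
    }
}
```

You'd then build each Unirest URL from nextIp(...) instead of the bare service name. Keep in mind HttpClient's connection keep-alive can still pin traffic to one backend unless connections get rotated as well.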
I am new to microservices and want to understand what is the best way I can implement below behaviour in microservices deployed on Kubernetes :
There are 2 different K8s clusters. Microservice B is deployed on both the clusters.
Now if a Microservice A calls Microservice B and B’s pods are not available in cluster 1, then the call should go to B of cluster 2.
I could have imagined to implement this functionality by using Netflix OSS but here I am not using it.
Also, keeping the inter-cluster communication aside for a second, how should I communicate between microservices?
One way that I know is to create Kubernetes Service of type NodePort for each microservice and use the IP and the nodePort in the calling microservice.
Question: what if someone deletes the target microservice's K8s Service? A new nodePort will be randomly assigned by K8s when the Service is recreated, and then I will have to go back to my calling microservice and change the nodePort of the target microservice. How can I decouple from the nodePort?
I researched about kubedns but seems like it only works within a cluster.
I have very limited knowledge about Istio and Kubernetes Ingress. Does any one of these provide something what I am looking for ?
Sorry for a long question. Any sort of guidance will be very helpful.
You can expose your application using Services; there are several kinds you can use:
ClusterIP: Exposes the Service on a cluster-internal IP. Choosing this value makes the Service only reachable from within the cluster. This is the default ServiceType.
NodePort: Exposes the Service on each Node’s IP at a static port (the NodePort). A ClusterIP Service, to which the NodePort Service routes, is automatically created. You’ll be able to contact the NodePort Service, from outside the cluster, by requesting <NodeIP>:<NodePort>.
LoadBalancer: Exposes the Service externally using a cloud provider’s load balancer. NodePort and ClusterIP Services, to which the external load balancer routes, are automatically created.
ExternalName: Maps the Service to the contents of the externalName field (e.g. foo.bar.example.com), by returning a CNAME record
For internal communication you can use a Service of type ClusterIP, and you can configure your applications with the Service DNS name instead of an IP.
E.g., a Service called my-app-1 can be reached internally using the DNS name http://my-app-1 or the FQDN http://my-app-1.<namespace>.svc.cluster.local.
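For example, a minimal ClusterIP Service for the my-app-1 case (the label selector and target port here are assumptions):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-1        # becomes the DNS name http://my-app-1
spec:
  type: ClusterIP       # the default, shown for clarity
  selector:
    app: my-app-1       # matches the pods' label
  ports:
    - port: 80          # what callers use
      targetPort: 8080  # what the container listens on
```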
For external communication, you can use NodePort or LoadBalancer.
NodePort is good when you have few nodes and know the IPs of all of them. And yes, per the Service docs you can specify a specific port number:
If you want a specific port number, you can specify a value in the nodePort field. The control plane will either allocate you that port or report that the API transaction failed. This means that you need to take care of possible port collisions yourself. You also have to use a valid port number, one that’s inside the range configured for NodePort use.
LoadBalancer gives you more flexibility, because you don't need to know all the node IPs; you just need to know the Service IP and port. But LoadBalancer is only supported on cloud providers; if you want to implement it on a bare-metal cluster, I recommend you take a look at MetalLB.
Finally, there is another option: use an Ingress. In my view this is the best way to expose HTTP applications externally, because you can create rules by path and host, and it gives you much more flexibility than Services. But only HTTP/HTTPS is supported; if you need TCP, use Services.
I'd recommend you take a look in this links to understand in deep how services and ingress works:
Kubernetes Services
Kubernetes Ingress
NGINX Ingress
Your design is pretty close to Istio Multicluster example.
By following the steps in the Istio multicluster lab you'll get two clusters with one Istio control plane that balance the traffic between two ClusterIP Services located in two independent Kubernetes clusters.
The lab's configuration watches the traffic load, but by rewriting the Controller Pod code you can configure it to switch the traffic to the second cluster if the first cluster's Service has no endpoints (all pods of a certain type are not in the Ready state).
It's just an example, you can change istiowatcher.go code to implement any logic you want.
There is more advanced solution using Admiral as an Istio Multicluster management automation tool.
Admiral provides automatic configuration for an Istio mesh spanning multiple clusters to work as a single mesh based on a unique service identifier that associates workloads running on multiple clusters to a service. It also provides automatic provisioning and syncing of Istio configuration across clusters. This removes the burden on developers and mesh operators, which helps scale beyond a few clusters.
This solution solves these key requirements for modern Kubernetes infrastructure:
Creation of service DNS entries decoupled from the namespace, as described in Features of multi-mesh deployments.
Service discovery across many clusters.
Supporting active-active & HA/DR deployments. We also had to support these crucial resiliency patterns with services being deployed in globally unique namespaces across discrete clusters.
This solution may become very useful at full scale.
Use an Ingress for inter-cluster communication, and a ClusterIP Service for intra-cluster communication between two microservices.
My project is looking to deploy a new j2ee application to Amazon's cloud. ElasticBeanstalk supports Tomcat apps, which seems perfect. Are there any particular design considerations to keep in mind when writing said app that might differ from just a standalone tomcat on a server?
For example, I understand that the server is meant to scale automatically. Is this like a cluster? Our application framework tends to like to stick state in the HttpSession, is that a problem? Or when it says it scales automatically, does that just mean memory and CPU?
Automatic scaling on AWS is done via adding more servers, not adding more CPU/RAM. You can add more CPU/RAM manually, but it requires shutting down the server for a minute to make the change, and then configuring any software running on the server to take advantage of the added RAM, so that's not the way automatic scaling is done.
Elastic Beanstalk is basically a management interface for Amazon EC2 servers, Elastic Load Balancers and Auto Scaling Groups. It sets all that up for you and provides a convenient way of deploying new versions of your application easily. Elastic Beanstalk will create EC2 servers behind an Elastic Load Balancer and use an Auto Scaling configuration to add more servers as your application load increases. It handles adding the servers to the load balancer when they are ready to receive traffic, and removing them from the load balancer and deleting the extra servers when they are no longer needed.
For your Java application running on Tomcat you have a few options to handle horizontal scaling well. You can enable sticky sessions on the Load Balancer so that all requests from a specific user will go to the same server, thus keeping the HttpSession tied to the user. The main problem with this is that if a server is removed from the pool you may lose some HttpSessions and cause any users that were "stuck" to that server to be logged out of your application. The solution to this is to configure your Tomcat instances to store sessions in a shared location. There are Tomcat session store implementations out there that work with AWS services like ElastiCache (Redis) and DynamoDB. I would recommend using one of those, probably the Redis implementation if you aren't already familiar with DynamoDB.
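As one hedged example of such a session store, here's roughly what the Tomcat context.xml wiring looks like with Redisson's session manager for Redis; attribute names and the config path should be checked against the Redisson version you use:

```xml
<!-- conf/context.xml: store HttpSessions in Redis (e.g. ElastiCache) via Redisson.
     Requires the redisson-tomcat jar matching your Tomcat version on the classpath. -->
<Context>
  <Manager className="org.redisson.tomcat.RedissonSessionManager"
           configPath="${catalina.base}/conf/redisson.yaml"
           readMode="REDIS"
           updateMode="DEFAULT"/>
</Context>
```

With sessions externalized like this, any server can be removed from the pool without logging users out, and sticky sessions become an optimization rather than a requirement.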
Another consideration for moving a Java application to AWS is that you cannot use any tools or libraries that rely on multicast. You may not be using multicast for anything, but in my experience every Java app I've had to migrate to AWS relied on multicast for clustering, and I had to modify it to use a different clustering method.
Also, for a successful migration to AWS I suggest you read up a bit on VPCs, private IP versus public IP, and Security Groups. A solid understanding of those topics is key to setting up your network so that your web servers can communicate with your DB and cache servers in a secure and performant manner.
I am trying to solve a distributed computing architecture problem. Here is the scenario.
Users come to my website and register. As part of the registration process they get a subdomain, for example foo.xyz.com.
Now each user's website is located/replicated on one or more cluster nodes using some arbitrary scheme.
When a user request comes in (an HTTP request via the browser), the appropriate subdomain must be routed to the matching cluster node. Essentially, I want my own dynamic domain name, and I need to implement it in a fast and efficient way.
I have a Java-based web application which runs inside a Jetty 7 container.
thanks,
NG
This should definitely be implemented outside of your application. Your web application should be, as much as possible, agnostic of the way requests get balanced in a cluster. The best performance would come from a hardware load balancer.
If you want to go for software based balancing I would configure Apache to serve as entry point and balance the traffic for your cluster with something like mod_proxy. See this tutorial that refers to Jetty.
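A minimal mod_proxy_balancer setup for two Jetty nodes might look like this (hostnames and ports are placeholders):

```apacheconf
# Requires mod_proxy, mod_proxy_http and mod_proxy_balancer to be loaded.
<Proxy "balancer://jettycluster">
    BalancerMember "http://node1.internal:8080"
    BalancerMember "http://node2.internal:8080"
    # Pin a user's requests to one node via the servlet session cookie.
    ProxySet stickysession=JSESSIONID
</Proxy>

ProxyPass        "/" "balancer://jettycluster/"
ProxyPassReverse "/" "balancer://jettycluster/"
```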
Have you taken a look at Nginx? It may be more than you need, but it does an effective job of routing subdomains to particular nodes.
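A sketch of the subdomain routing in Nginx, using a regex capture on server_name and a map from subdomain to backend node (all IPs and names are placeholders; in practice the map would be generated from your user-to-node scheme):

```nginx
# Map each user subdomain to the cluster node hosting that user's site.
map $subdomain $backend {
    default  127.0.0.1:8080;
    foo      10.0.0.11:8080;
    bar      10.0.0.12:8080;
}

server {
    listen 80;
    # Capture the leftmost label of *.xyz.com into $subdomain.
    server_name ~^(?<subdomain>[^.]+)\.xyz\.com$;

    location / {
        proxy_pass http://$backend;
        proxy_set_header Host $host;
    }
}
```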