Spring Boot + k8s - autodiscovery solution

Let's say I have the following architecture based on Java Spring Boot + Kubernetes:
N pods with a similar purpose (let's say: order-executors) - a GROUP of pods
other pods with other business implementations
I want to create a solution where:
One (central) pod can communicate with all pods, or with a specific GROUP of pods, to get some information about the state of these pods (REST or any other way)
Some pods can of course be replicated, e.g. x5, and the central pod should communicate with all replicas
Is this possible with any technology? If every order-executor pod has a k8s Service, is there a way to communicate with all replicas of this pod to get some info about all of them? The central pod has only the Service URL, so it doesn't know which replica is on the other side.
Is there any solution to autodiscover every pod in the cluster and communicate with them without changing any configuration? Spring Cloud? Eureka? Consul?
P.S.
etcd and RabbitMQ are also deployed in my architecture. Maybe they can be used as part of the solution?

You can use a "headless Service", one with clusterIP: None. The result is that when you do a DNS lookup for the normal Service DNS name, instead of a single A record with the magic proxy mesh IP, you get a separate A record for every pod that matches the selector and is ready (i.e. the IPs that would normally be the backends). You can also fetch these from the Kubernetes API via the Endpoints type (or EndpointSlices if you somehow need to support groups with thousands of pods, but for just 5 it would be Endpoints) using the Kubernetes Java client library. In either case, you then have a list of IPs and the rest is up to you to figure out :)
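For the DNS route, a minimal Java sketch could look like the one below. It only uses the JDK; the class name HeadlessServiceLookup and the Service name in the usage note are made up for illustration, and it assumes the code runs inside the cluster so that the cluster DNS is used.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
final class HeadlessServiceLookup {

    // Resolves every address record behind a DNS name. For a headless Service
    // this yields one address per ready pod instead of a single cluster IP.
    static List<String> resolvePodIps(String serviceDnsName) {
        List<String> ips = new ArrayList<>();
        try {
            for (InetAddress address : InetAddress.getAllByName(serviceDnsName)) {
                ips.add(address.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // No records: the name does not exist or no pod is ready right now.
            return List.of();
        }
        return ips;
    }
}
```

The central pod would call something like HeadlessServiceLookup.resolvePodIps("order-executors.default.svc.cluster.local") (a hypothetical Service and namespace) and then query each returned IP. Keep in mind that the JVM caches successful lookups for a while (the networkaddress.cache.ttl security property), so newly started replicas may show up with a delay.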

I'm not familiar with Java, but the concept is something I've done before. There are a few approaches you can take. One of them is using Kubernetes events.
Your application should listen to Kubernetes events (using a websocket). Whenever a pod with a specific label or label set is created, removed, enters the terminating state, etc., you will get updates about its state, including the pod IPs.
You can then do whatever you like in your application without having to poll the Kubernetes API yourself.
You can even use a sidecar container which listens to those events and writes them to a file. By using a shared volume, your application can read that file and use its content.
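Whichever way the events arrive, the application side boils down to keeping a small table of pod IPs up to date. Here is a rough Java sketch of that bookkeeping. PodRegistry and its method names are invented for this example, and the event source (websocket, client library, or sidecar file) is deliberately left out; the only Kubernetes-specific part is the watch event type strings ADDED, MODIFIED and DELETED.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory view of the pods in one GROUP, fed by watch events.
final class PodRegistry {

    private final Map<String, String> ipByPodName = new ConcurrentHashMap<>();

    // type is the watch event type; ready says whether the pod can take requests.
    void onEvent(String type, String podName, String podIp, boolean ready) {
        switch (type) {
            case "ADDED":
            case "MODIFIED":
                if (ready && podIp != null) {
                    ipByPodName.put(podName, podIp);
                } else {
                    // Not ready (e.g. starting or terminating): stop calling it.
                    ipByPodName.remove(podName);
                }
                break;
            case "DELETED":
                ipByPodName.remove(podName);
                break;
            default:
                // Other event types (e.g. BOOKMARK, ERROR) do not change the table.
                break;
        }
    }

    // Immutable copy of the current pod name to IP mapping.
    Map<String, String> snapshot() {
        return Map.copyOf(ipByPodName);
    }
}
```

The central pod can then iterate over snapshot() whenever it needs to contact every replica.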

Related

How to handle Kafka container lifecycle using spring kafka in Kubernetes multipod deployment

I am using the Spring Kafka implementation and I need to start and stop my Kafka consumer through a REST API. For that I am using KafkaListenerEndpointRegistry endpointRegistry:
endpointRegistry.getListenerContainer("consumer1").stop();
endpointRegistry.getListenerContainer("consumer1").start();
We are deploying the microservice in Kubernetes pods, so there might be multiple instances of the same microservice. How can I start and stop the consumer on all of the containers?

Kubernetes offers nothing to automatically broadcast an HTTP request to all pods of a Service, so you have to do it yourself.
Broadcasting over Kafka
You can publish the start/stop command from the single instance that receives the HTTP request to a topic dedicated to broadcasting commands between all instances.
Of course, you must make sure that each instance reads all messages on that topic, so you need to prevent the partitions from being balanced between these instances. You can achieve that by setting a unique group id (e.g. by suffixing your normal groupId with a UUID) on the Consumer for that topic.
Broadcasting over HTTP
Kubernetes knows which pods are listening on which endpoints, and you can get that information in your service. Spring Cloud Kubernetes (https://cloud.spring.io/spring-cloud-static/spring-cloud-kubernetes/2.0.0.M1/reference/html/#ribbon-discovery-in-kubernetes) makes it easy to get at that information. There are probably lots of different ways to do that; with Spring Cloud Kubernetes it would go something like this:
Receive the command on the randomly selected pod, get the ServerList from Ribbon (it contains all the instances and the IP address/port where they can be reached) for your service, and send a new HTTP request to each of them.
I would prefer the Kafka approach for its robustness; the HTTP approach might be easier to implement if you're already using Spring Cloud.
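As a small illustration of the unique group id idea, the consumer properties could be built roughly like this. BroadcastConsumerConfig is a made-up helper; group.id and auto.offset.reset are standard Kafka consumer settings, but choosing "latest" is my assumption (so that a freshly started instance does not replay old commands), not something required by the approach.

```java
import java.util.Properties;
import java.util.UUID;

// Hypothetical helper that builds per-instance consumer properties.
final class BroadcastConsumerConfig {

    static Properties forInstance(String baseGroupId) {
        Properties props = new Properties();
        // A random suffix puts every instance in its own consumer group,
        // so each instance receives every message on the broadcast topic.
        props.setProperty("group.id", baseGroupId + "-" + UUID.randomUUID());
        // Assumption: only commands sent after startup are of interest.
        props.setProperty("auto.offset.reset", "latest");
        return props;
    }
}
```

These properties would be merged into whatever configuration the consumer for the command topic already uses; the normal business consumers keep their shared group id.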
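For the HTTP variant, the fan-out itself might look like the sketch below. It assumes you already have the list of instance base URLs (from Ribbon's ServerList or any other source) and uses the JDK 11+ HttpClient. CommandBroadcaster and the path in the usage note are invented for illustration, and -1 is just this sketch's marker for an instance that could not be reached.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: sends the same command to every known instance.
final class CommandBroadcaster {

    // Calls sender for each instance and records the status code, or -1 on failure.
    static Map<URI, Integer> broadcast(List<URI> instances, Function<URI, Integer> sender) {
        Map<URI, Integer> results = new LinkedHashMap<>();
        for (URI instance : instances) {
            try {
                results.put(instance, sender.apply(instance));
            } catch (RuntimeException e) {
                results.put(instance, -1);
            }
        }
        return results;
    }

    // Sender that issues an empty POST to the given path on each instance.
    static Function<URI, Integer> httpPost(HttpClient client, String path) {
        return baseUrl -> {
            HttpRequest request = HttpRequest.newBuilder(baseUrl.resolve(path))
                    .POST(HttpRequest.BodyPublishers.noBody())
                    .build();
            try {
                return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        };
    }
}
```

Usage would be along the lines of CommandBroadcaster.broadcast(instances, CommandBroadcaster.httpPost(HttpClient.newHttpClient(), "/consumers/consumer1/stop")), where the path is a hypothetical endpoint that calls endpointRegistry.getListenerContainer("consumer1").stop() locally. Note that instances which start after the broadcast will not have seen the command, which is one reason the Kafka approach is more robust.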

Best way for inter-cluster communication between microservices on Kubernetes?

I am new to microservices and want to understand the best way to implement the behaviour below in microservices deployed on Kubernetes:
There are 2 different K8s clusters. Microservice B is deployed on both clusters.
Now if Microservice A calls Microservice B and B's pods are not available in cluster 1, then the call should go to B in cluster 2.
I could imagine implementing this functionality with Netflix OSS, but I am not using it here.
Also, setting the inter-cluster communication aside for a second, how should I communicate between microservices?
One way that I know is to create a Kubernetes Service of type NodePort for each microservice and use the IP and the nodePort in the calling microservice.
Question: What if someone deletes the target microservice's K8s Service? A new nodePort will be randomly assigned by K8s when the Service is recreated, and then I will have to go back to my calling microservice and change the nodePort of the target microservice. How can I decouple from the nodePort?
I researched kube-dns, but it seems to work only within a cluster.
I have very limited knowledge of Istio and Kubernetes Ingress. Does either of these provide what I am looking for?
Sorry for the long question. Any sort of guidance will be very helpful.

You can expose your application using Services. There are several types of Service you can use:
ClusterIP: Exposes the Service on a cluster-internal IP. Choosing this value makes the Service only reachable from within the cluster. This is the default ServiceType.
NodePort: Exposes the Service on each Node’s IP at a static port (the NodePort). A ClusterIP Service, to which the NodePort Service routes, is automatically created. You’ll be able to contact the NodePort Service, from outside the cluster, by requesting <NodeIP>:<NodePort>.
LoadBalancer: Exposes the Service externally using a cloud provider’s load balancer. NodePort and ClusterIP Services, to which the external load balancer routes, are automatically created.
ExternalName: Maps the Service to the contents of the externalName field (e.g. foo.bar.example.com), by returning a CNAME record.
For internal communication you can use the ClusterIP Service type, and configure your applications with the Service DNS name instead of an IP.
E.g. a Service called my-app-1 can be reached internally using the DNS name http://my-app-1 or the FQDN http://my-app-1.<namespace>.svc.cluster.local.
For external communication, you can use NodePort or LoadBalancer.
NodePort is good when you have few nodes and know the IPs of all of them. And yes, according to the Service docs, you can specify a specific port number:
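In the calling microservice this just means building the URL from the Service name instead of from an IP and nodePort. A tiny Java sketch, where ServiceDns is a made-up helper and cluster.local is assumed to be the cluster domain (it is the default, but a cluster can be configured differently):

```java
// Hypothetical helper for building in-cluster Service URLs.
final class ServiceDns {

    // Short form: only resolves for callers in the same namespace as the Service.
    static String shortUrl(String service) {
        return "http://" + service;
    }

    // Fully qualified form: resolves from any namespace in the cluster.
    static String fqdnUrl(String service, String namespace) {
        return "http://" + service + "." + namespace + ".svc.cluster.local";
    }
}
```

Because the name stays the same when the Service is deleted and recreated, the caller no longer depends on a randomly assigned nodePort.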
If you want a specific port number, you can specify a value in the nodePort field. The control plane will either allocate you that port or report that the API transaction failed. This means that you need to take care of possible port collisions yourself. You also have to use a valid port number, one that’s inside the range configured for NodePort use.
LoadBalancer gives you more flexibility, because you don't need to know all the node IPs; you just need to know the Service IP and port. But LoadBalancer is only supported by cloud providers; if you want to implement it in a bare-metal cluster, I recommend you take a look at MetalLB.
Finally, another option is to use an Ingress. In my view it is the best way to expose HTTP applications externally, because you can create rules by path and host, and it gives you much more flexibility than Services. But only HTTP/HTTPS is supported; if you need TCP, use Services.
I'd recommend taking a look at these links to understand in depth how Services and Ingress work:
Kubernetes Services
Kubernetes Ingress
NGINX Ingress

Your design is pretty close to the Istio Multicluster example.
By following the steps in the Istio multicluster lab you'll get two clusters with one Istio control plane that balances the traffic between two ClusterIP Services located in two independent Kubernetes clusters.
The lab's configuration watches the traffic load, but by rewriting the Controller Pod code you can configure it to switch the traffic to the second cluster if the first cluster's Service has no endpoints (i.e. no pods of that type are in the Ready state).
It's just an example; you can change the istiowatcher.go code to implement any logic you want.
There is a more advanced solution using Admiral as an Istio multicluster management automation tool.
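The switching rule described above is simple enough to sketch. The lab's controller is written in Go; the version below is only a Java illustration of the decision itself, with invented names, and it assumes something else supplies the number of ready endpoints per cluster.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical decision rule: prefer the first cluster that has ready endpoints.
final class ClusterFailover {

    // The map must iterate in priority order (e.g. a LinkedHashMap with
    // cluster 1 inserted first). Returns empty if no cluster can serve traffic.
    static Optional<String> chooseCluster(Map<String, Integer> readyEndpointsByCluster) {
        for (Map.Entry<String, Integer> entry : readyEndpointsByCluster.entrySet()) {
            if (entry.getValue() > 0) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }
}
```

With cluster 1 at zero ready endpoints and cluster 2 at two, the rule picks cluster 2; as soon as cluster 1 has a ready pod again, traffic goes back to it.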
Admiral provides automatic configuration for an Istio mesh spanning multiple clusters to work as a single mesh based on a unique service identifier that associates workloads running on multiple clusters to a service. It also provides automatic provisioning and syncing of Istio configuration across clusters. This removes the burden on developers and mesh operators, which helps scale beyond a few clusters.
This solution solves these key requirements for modern Kubernetes infrastructure:
Creation of service DNS entries decoupled from the namespace, as described in Features of multi-mesh deployments.
Service discovery across many clusters.
Supporting active-active & HA/DR deployments. We also had to support these crucial resiliency patterns with services being deployed in globally unique namespaces across discrete clusters.
This solution may become very useful at full scale.

Use an Ingress for inter-cluster communication and a ClusterIP Service for intra-cluster communication between two microservices.

ECS Fargate Routing to 20+ Containers from ALB

I'm running 20+ Java Elastic Beanstalk instances that all use Classic Load Balancers and the works (autoscaling, security groups, etc.). I'm trying to figure out how to improve this setup, move it to ECS, and reduce overall resource consumption.
The question I'm trying to answer is whether it's possible to have Fargate handle 20+ different host and path match conditions routing to 20+ different containers in a single service.
I have a POC in place where my containers spin up on an ECS EC2 instance instead of Fargate, and I can see everything is working when I visit the Route53 CNAME I'm testing with.
I have traffic coming into a main ALB that redirects HTTP traffic to HTTPS. After that, the traffic is filtered through rules that route it based on a host and path condition.
First rule
host is hello.world.com + path is /java Then forward to helloworld-target-group
Second rule
host is new.world.com + path is /java Then forward to welcomeworld-target-group
and so on
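To make the intended routing concrete, here is the same rule table modelled in plain Java. This is only a model of the matching logic, not ALB configuration: ListenerRules is an invented name, and it does exact matching on host and path, whereas real ALB conditions also allow wildcards.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical model of the host + path rules listed above.
final class ListenerRules {

    static final class Rule {
        final String host;
        final String path;
        final String targetGroup;

        Rule(String host, String path, String targetGroup) {
            this.host = host;
            this.path = path;
            this.targetGroup = targetGroup;
        }
    }

    static final List<Rule> RULES = List.of(
            new Rule("hello.world.com", "/java", "helloworld-target-group"),
            new Rule("new.world.com", "/java", "welcomeworld-target-group"));

    // Returns the target group of the first rule whose host and path both match.
    static Optional<String> route(List<Rule> rules, String host, String path) {
        for (Rule rule : rules) {
            if (rule.host.equalsIgnoreCase(host) && rule.path.equals(path)) {
                return Optional.of(rule.targetGroup);
            }
        }
        return Optional.empty();
    }
}
```

Each of the 20+ containers would have one such rule, which is why the number of target groups a single service can use becomes the limiting factor.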
I've read that a single ECS service can only have 1 ALB with a max of 5 target groups; my initial thinking was to have the 20+ target groups on a single service with an ALB and 20+ containers.
Now I'm thinking that, as a possible workaround, I could have 5 different services, each making use of the limit of 5 target groups per service (up to 25 target groups in total).
The containers will all use the same Docker image. The only difference will be the environment variables on the containers. (I need each container to have different environment variables so that Java will know which DB to use.)
Has anyone looked at a similar problem, or does anyone know of a better solution?
Edit: I might be wrong, or AWS may have updated the routing rules, but as of now the answer I've landed on is not to use Fargate to route traffic to 20+ containers. Instead, use an ECS EC2 instance and set up 20+ target groups that all route to separate ports on the EC2 instance. This way you can route traffic to 20+ containers from a single instance, which is pretty cool.
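On the Java side, picking the DB from the environment could be as small as the sketch below. DbConfig and the variable name DB_URL are placeholders for illustration; the actual variable names in this setup are not given.

```java
import java.util.Map;

// Hypothetical helper: one image, behaviour selected per container via env vars.
final class DbConfig {

    // Takes the environment as a map so it can be exercised without a real container.
    static String jdbcUrl(Map<String, String> env) {
        String url = env.get("DB_URL"); // placeholder variable name
        if (url == null || url.isBlank()) {
            throw new IllegalStateException("DB_URL is not set");
        }
        return url;
    }
}
```

At startup the application would call DbConfig.jdbcUrl(System.getenv()), and each container definition would set a different value.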

Java health monitoring in clustered environment

I am working on a back-end service that runs in a clustered environment (three instances running in parallel to distribute some calculation jobs). I am using Hazelcast to create the cluster and distribute jobs.
I want to create a REST endpoint to do some health checks of the service. As this service runs in clustered mode, I need to check the health of all instances.
How would I achieve this kind of health check across the cluster?
Is there a recommended library for this?

One approach is to “push” health indicators to a DB (all instances need to know or “discover” the DB).
Another approach is to use Consul (or a similar solution) to register services with health checks.
Consul has a few Java clients you can choose from.

The Java platform has the JMX feature: you need to implement JMX beans for your services which provide application metrics. Then you can use one of the existing solutions to monitor JMX metrics (Zabbix, Grafana, ELK etc.) or implement your own service which polls or consumes JMX data from each instance in your cluster and provides access to this data via a REST API.
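A minimal sketch of such a JMX bean, using only the JDK, might look like this. The names (JmxHealthSketch, JobHealth, the object name in the usage note) and the "healthy while the queue is short" rule are assumptions for illustration; the part that matters is the standard MBean convention that a class X exposes the getters declared in a public interface XMBean.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicInteger;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical example of exposing per-instance health via JMX.
final class JmxHealthSketch {

    // Getters here become the JMX attributes "QueuedJobs" and "Healthy".
    public interface JobHealthMBean {
        int getQueuedJobs();
        boolean isHealthy();
    }

    public static final class JobHealth implements JobHealthMBean {
        private final AtomicInteger queuedJobs = new AtomicInteger();
        private final int maxQueuedJobs;

        public JobHealth(int maxQueuedJobs) {
            this.maxQueuedJobs = maxQueuedJobs;
        }

        // Called by the application; not part of the MBean interface.
        public void recordQueuedJobs(int value) {
            queuedJobs.set(value);
        }

        @Override
        public int getQueuedJobs() {
            return queuedJobs.get();
        }

        @Override
        public boolean isHealthy() {
            return queuedJobs.get() <= maxQueuedJobs;
        }
    }

    // Registers the bean on the platform MBean server, replacing any old one.
    static ObjectName register(JobHealth bean, String objectName) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName(objectName);
            if (server.isRegistered(name)) {
                server.unregisterMBean(name);
            }
            server.registerMBean(bean, name);
            return name;
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads one attribute the way a local monitoring component would.
    static Object read(ObjectName name, String attribute) {
        try {
            return ManagementFactory.getPlatformMBeanServer().getAttribute(name, attribute);
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Each instance would register its bean at startup, e.g. under a name like "com.example.health:type=JobHealth" (hypothetical), and the monitoring tool or your own polling service would read the attributes remotely from every instance.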

How to make Eureka really work on Cloud Foundry (cf)?

In Cloud Foundry all apps are accessible via Cloud Foundry load balancers.
Each load balancer has a URL under which to call the app.
Even though there is a hidden way (e.g. X-CF-APP-INSTANCE), app instances are not meant to be called directly.
Eureka is highly available and partition tolerant, but it lacks consistency (CAP theorem).
To overcome the staleness of registry data, Netflix uses client-side load balancing (Ribbon).
(see: https://github.com/Netflix/eureka/wiki/Eureka-2.0-Architecture-Overview)
Because an app in Cloud Foundry is called via its load balancer, it registers itself at Eureka with the address of the load balancer. As stated before, the app instance does not even have an address.
To make it more concrete, let's say there are two instances of an app A (a1 and a2) that register themselves at Eureka.
Eureka now has two entries for app A, but both have the same address assigned.
Now when Ribbon steps in to overcome the consistency problem of Eureka, it is very likely that a retry is directed to the same instance as the first try.
Failover using Ribbon therefore does not work with Eureka in Cloud Foundry.
The fact that all instances of an app are assigned the same address in Eureka makes things complicated in many situations.
Even the replication of Eureka could only be solved with a workaround. Turbine needs to be fed in push mode, etc.
We are thinking about enhancing Eureka so that it sets the X-CF-APP-INSTANCE header.
Before doing that, we wanted to know whether someone knows an easier way to make Eureka really work on Cloud Foundry.
This question is related to: Deploying Eureka/ribbon code to Cloud Foundry

I think Eureka, and any other discovery service, has no place in a PaaS like Cloud Foundry. Although it sounds appealing to enhance Eureka to support the X-CF-APP-INSTANCE header, you would also need to enhance the client part (Ribbon) to take advantage of that information and add that header to each request.
Well, it's 9 months later; maybe you have done that in the meantime? Or did you follow an alternative solution path?
Anyway, there is now an additional app-to-app integration option: container-to-container networking. And even here, the Cloud Foundry dev team decided to provide their own discovery mechanism.
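For reference, adding the instance header on the client side is not much code by itself; the work is in getting the app GUID and instance index to the caller. A rough sketch with the JDK 11+ HttpClient API, where CfInstanceRequest is an invented name and the header value format APP_GUID:INSTANCE_INDEX is taken from my reading of the Cloud Foundry routing docs, so please verify it against your platform version:

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper: builds a request pinned to one app instance.
final class CfInstanceRequest {

    // appUrl is the normal load-balanced route of the app; the header asks
    // the router to deliver the request to one specific instance.
    static HttpRequest forInstance(URI appUrl, String appGuid, int instanceIndex) {
        return HttpRequest.newBuilder(appUrl)
                .header("X-CF-APP-INSTANCE", appGuid + ":" + instanceIndex)
                .GET()
                .build();
    }
}
```

A retry would then use a different instance index instead of the same route without a header, which is the failover behaviour that Ribbon cannot provide when every instance is registered under the same address.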
