Context
We are using Spring Cloud Netflix with Eureka for service discovery and Zuul for proxying and load-balancing the services.
The microservices are implemented in Node.js and register with Eureka via the NPM module eureka-js-client, plus a custom layer in between that handles configuration and other concerns common to all microservices.
Problem
The problem is that Eureka does not recognize when a service goes down. This matters because our development infrastructure auto-deploys: on each deployment it restarts the microservices on different ports without restarting Eureka (and Zuul).
Therefore, after a while we have ten or more instances of one microservice of which only one is actually up, yet all are reported as being UP.
Solution Approach
I tried lowering the heartbeatInterval on the client, but that does not help.
I tried lowering the renewalThresholdUpdateIntervalMs on the server, but that does not help either.
Many more frustrating, unsuccessful property experiments…
Question
How do I configure Eureka to evict instances, or to mark them as DOWN, when they stop sending heartbeats for a reasonable time (not 30 minutes or so)?
Code Snippets
The server itself contains no code worth mentioning (just a few annotations to start the Eureka server using Spring Cloud Starter).
The configuration of the Eureka server (I have removed all the non-working attempts):
server:
  port: 8761

spring:
  cloud:
    client:
      hostname: localhost

eureka:
  instance:
    address: 127.0.0.1
    hostname: ${spring.cloud.client.hostname}
The client configuration that is sent to the server (using eureka-js-client):
{
  instance : {
    instanceId : `${CONFIG.instance.address}:${CONFIG.instance.name}:${CONFIG.instance.port}`,
    app : CONFIG.instance.name,
    hostName : CONFIG.instance.host,
    ipAddr : CONFIG.instance.address,
    port : {
      '$' : CONFIG.instance.port,
      '@enabled' : true
    },
    homePageUrl : `http://${CONFIG.instance.host}:${CONFIG.instance.port}/`,
    statusPageUrl : `http://${CONFIG.instance.host}:${CONFIG.instance.port}/info`,
    healthCheckUrl : `http://${CONFIG.instance.host}:${CONFIG.instance.port}/health`,
    vipAddress : CONFIG.instance.name,
    secureVipAddress : CONFIG.instance.name,
    dataCenterInfo : {
      '@class' : 'com.netflix.appinfo.InstanceInfo$DefaultDataCenterInfo',
      name : 'MyOwn'
    }
  },
  eureka : {
    host : CONFIG.eureka.host,
    port : CONFIG.eureka.port,
    servicePath : CONFIG.eureka.servicePath || '/eureka/apps/',
    healthCheckInterval : 5000
  }
}
after a while we have ten or more instances of one microservice where only one is up but all are recognized as being UP.

Eureka has a 'self-preservation' mode: if fewer than 85% of the expected instance heartbeats are arriving, it will not evict any instances. You should see a warning about this on the Eureka dashboard.
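If self-preservation is what keeps the stale instances registered, you can disable it and tighten the eviction task on the server. Below is a minimal sketch of the server-side application.yml; the property names come from Spring Cloud Netflix, but the timing value is an assumption you should tune:

eureka:
  server:
    # Evict expired leases even when the renewal rate drops below
    # the self-preservation threshold
    enable-self-preservation: false
    # Run the eviction task every 5 s instead of the 60 s default
    eviction-interval-timer-in-ms: 5000

On the client side, the lease itself can be shortened by adding a leaseInfo block (renewalIntervalInSecs, durationInSecs) to the instance object shown above, assuming eureka-js-client forwards that object unchanged as the registration payload.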
Related
I have the following setup:
Spring Cloud Gateway on port 8080
Routing /user to a Spring REST API (called User) on port 9000
Routing /character to a Spring REST API (called Character) on port 9001
Spring Cloud Config Server on port 8090
The Gateway and the two REST API applications are connected to the Config Server to pull their configs. The applications themselves start up correctly and can be used via their respective ports; I tested this by calling /swagger-ui.html, which works fine.
When calling /actuator/gateway/routes I get a list of the routes of the gateway, which seem fine.
For all four applications I set up:
server:
  address: 0.0.0.0
  port: 8080 # ports are adjusted for each service
  forward-headers-strategy: framework
Before I used the Spring Cloud Config Server, the Gateway was running fine. Now I receive the following error whenever I call either /user/swagger-ui.html or /character/swagger-ui.html:
java.lang.IllegalArgumentException: Invalid IPv4 address: 0:0:0:0:0:0:0:1:60979
at org.springframework.web.util.UriComponentsBuilder.parseForwardedFor(UriComponentsBuilder.java:363) ~[spring-web-5.3.8.jar:5.3.8]
at org.springframework.web.filter.ForwardedHeaderFilter$ForwardedHeaderExtractingRequest.<init>(ForwardedHeaderFilter.java:246) ~[spring-web-5.3.8.jar:5.3.8]
at org.springframework.web.filter.ForwardedHeaderFilter.doFilterInternal(ForwardedHeaderFilter.java:149) ~[spring-web-5.3.8.jar:5.3.8]
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119) ~[spring-web-5.3.8.jar:5.3.8]
I tried changing forward-headers-strategy to native, which did not help either. I dug into UriComponentsBuilder, set some breakpoints to inspect the variables in there, and I suspect request.headers["forwarded"] is the reason it fails:
it tries to resolve the "for" entry inside that header, which is not an IPv4 address, and thus fails. Forcing IPv4 with server.address seemed to work before using Spring Cloud Config, but does not now. Has anyone come across the same issue and knows what I am supposed to do to get rid of this exception?
So all in all the routing itself seems to work, as it routes to the correct application. The application itself (User in this case) throws the error, not the Gateway.
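For context on why the parse fails: RFC 7239 requires IPv6 addresses in the Forwarded header to be quoted and enclosed in brackets, so a well-formed entry for the loopback address and port seen in the exception would presumably look like this sketch:

Forwarded: for="[0:0:0:0:0:0:0:1]:60979"

An unbracketed value such as 0:0:0:0:0:0:0:1:60979 cannot be split into host and port the way an IPv4 address can, which matches the IllegalArgumentException above.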
The issue is resolved with Spring Cloud Gateway version 3.0.4, which you can get from org.springframework.cloud:spring-cloud-dependencies:2020.0.4.
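If you manage versions through the Spring Cloud BOM, a minimal Gradle sketch of pulling that release in (the build-file layout is illustrative; the coordinates are the ones from the answer above):

dependencies {
    // Import the BOM so spring-cloud-starter-gateway resolves to 3.0.4
    implementation platform('org.springframework.cloud:spring-cloud-dependencies:2020.0.4')
    implementation 'org.springframework.cloud:spring-cloud-starter-gateway'
}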
Problem:
How do I resolve the hostname of a Kubernetes pod?
I have the following requirement: we are using gRPC with Java. One app runs our gRPC server; another app runs the gRPC client, which connects to the gRPC server (running in another pod).
We have three Kubernetes pods running our gRPC server.
Let's say:
my-service-0, my-service-1, my-service-2
my-service has a cluster IP of 10.44.5.11.
We have another three Kubernetes pods running our gRPC client.
Let's say:
my-client-0, my-client-1, my-client-2
Without security:
I try to connect a gRPC client pod to a gRPC server pod, and it works fine:
gRPC client (pod my-client) ----------------> gRPC server (pod my-service)
So without security I give the hostname as my-service and it works without any problem.
ManagedChannel channel = ManagedChannelBuilder.forAddress("my-service", 50052)
    .usePlaintext()
    .build();
With SSL security:
If I try to connect to the gRPC server, it throws a hostname-mismatch error.
We created a certificate with the wildcard *.default.pod.cluster.local.
It throws the error below:
java.security.cert.CertificateException: No name matching my-service found
at java.base/sun.security.util.HostnameChecker.matchDNS(HostnameChecker.java:225) ~[na:na]
at java.base/sun.security.util.HostnameChecker.match(HostnameChecker.java:98) ~[na:na]
at java.base/sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:455) ~[na:na]
Non-working code:
ManagedChannel channel = NettyChannelBuilder.forAddress("my-service", 50052)
    .sslContext(GrpcSslContexts.forClient()
        .trustManager(new File(System.getenv("GRPC_CLIENT_CA_CERT_LOCATION")))
        .build())
    .build();
But if I give the hostname as 10-44-5-11.default.pod.cluster.local, it works correctly.
Working code:
ManagedChannel channel = NettyChannelBuilder.forAddress("10-44-5-11.default.pod.cluster.local", 50052)
    .sslContext(GrpcSslContexts.forClient()
        .trustManager(new File(System.getenv("GRPC_CLIENT_CA_CERT_LOCATION")))
        .build())
    .build();
Now my problem is that the pod's cluster IP is dynamic and changes on every deployment. What is the right way to resolve this hostname?
Is it possible to look up the IP for a hostname, append default.pod.cluster.local to it, and connect to the gRPC server that way?
Addressing a pod directly is not a good solution, since Kubernetes may need to move your pods around the cluster, for example when a node fails.
To let your clients/traffic easily find the desired containers, place them behind a Service with a single static IP address. The Service IP can be looked up through DNS.
This is how you connect to the service through its FQDN:
my-service.default.svc.cluster.local
Here my-service is your Service name, default is the namespace, and svc.cluster.local is a configurable cluster domain suffix used by all services in the cluster.
It's worth knowing that you can skip the svc.cluster.local suffix, and even the namespace if the pods are in the same namespace; then you refer to the service simply as my-service.
For more, check the Kubernetes documentation about DNS.
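Putting that together with the question's working snippet, here is a minimal sketch of the client pointed at the Service FQDN instead of a pod address. It assumes the server certificate is (re)issued to cover the Service name (e.g. a *.default.svc.cluster.local wildcard), since the existing *.default.pod.cluster.local certificate will not match it:

import java.io.File;
import javax.net.ssl.SSLException;

import io.grpc.ManagedChannel;
import io.grpc.netty.GrpcSslContexts;
import io.grpc.netty.NettyChannelBuilder;

public class GrpcClientFactory {

    // Connect via the Service's stable DNS name; Kubernetes routes the
    // connection to a healthy backing pod, so changing pod IPs no longer matter.
    public static ManagedChannel channelForService() throws SSLException {
        return NettyChannelBuilder
                .forAddress("my-service.default.svc.cluster.local", 50052)
                .sslContext(GrpcSslContexts.forClient()
                        .trustManager(new File(System.getenv("GRPC_CLIENT_CA_CERT_LOCATION")))
                        .build())
                .build();
    }
}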
I am following Josh Long's presentation at https://www.youtube.com/watch?v=5q8B6lYhFvE&feature=youtu.be and working through his examples. At about 34 minutes in, he demos setting up the Eureka service.
This service uses the config service. My config service is running and exposes the information the Eureka service needs. I have checked my bootstrap.properties and it looks correct according to his presentation. However, the Eureka service does not finish starting up and does not seem to reach the config service; my other service does reach the config service and works fine.
The Eureka service should come up at http://localhost:8761/ but does not; it comes up on 8080 instead.
eureka service bootstrap.properties:
spring.application.name=eureka-service
spring.cloud.config.uri=http://localhost:8888
First error:
2016-10-13 18:43:00.088 ERROR 7464 --- [ main] c.n.d.s.t.d.RedirectingEurekaHttpClient : Request execution error
com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: Connection refused: connect
Second error:
2016-10-13 18:43:00.097 ERROR 7464 --- [ main] com.netflix.discovery.DiscoveryClient : DiscoveryClient_EUREKA-SERVICE/RB-64-PC.Home:eureka-service - was unable to refresh its cache! status = Cannot execute request on any known server
com.netflix.discovery.shared.transport.TransportException: Cannot execute request on any known server
Found the problem: I had accidentally added the Config Server dependency to the project when I created it (using http://start.spring.io/ as in the video). That dependency kept the initialization from reaching the actual config server running on another port.
I use Eureka for service discovery and as a load balancer, and it works fine with two instances of a service "A". However, when I stop one of those instances, Eureka does not recognize that it is down, and I keep getting an error page every time the load balancer routes to the dead instance.
I have set enableSelfPreservation to false to prevent that, but it still takes Eureka 3 to 5 minutes to unregister the instance. I want high availability for my service and failover within seconds. Is this possible with Eureka? If not, how can I ensure that only live instances are used when the others are dead?
I am using Spring Boot; here is my configuration for the Eureka server:
server:
  port: 8761

eureka:
  instance:
    hostname: localhost
  client:
    registerWithEureka: false
    fetchRegistry: false
    serviceUrl:
      defaultZone: http://${eureka.instance.hostname}:${server.port}/eureka/
  server:
    enableSelfPreservation: false
You should add a Ribbon configuration to your application.yml. It is also recommended to set the Hystrix isolation level to THREAD with a timeout.
Note: this configuration belongs on the client side (usually your gateway server), since Ribbon (and Spring Cloud in general) is a form of client-side load balancing.
Here's an example that I use:
hystrix:
  command:
    default:
      execution:
        isolation:
          strategy: THREAD
          thread:
            timeoutInMilliseconds: 40000 # time out after this many milliseconds

ribbon:
  ConnectTimeout: 5000 # try to connect to the endpoint for 5 seconds
  ReadTimeout: 5000 # wait up to 5 seconds for a response after a successful connection
  # Max number of retries on the same server (excluding the first try)
  MaxAutoRetries: 1
  # Max number of next servers to retry (excluding the first server)
  MaxAutoRetriesNextServer: 2
UPDATE
The README in this repo has been updated to demonstrate the solution in the accepted answer.
I'm working with a simple example of a Spring Boot Eureka service registration and discovery based on this guide.
If I start up one client instance, it registers properly, and it can see itself through the DiscoveryClient. If I start up a second instance with a different name, it works as well.
But if I start up two instances with the same name, the dashboard only shows 1 instance running, and the DiscoveryClient only shows the second instance.
When I kill the 2nd instance, the 1st one is visible again through the dashboard and the discovery client.
Here are some more details about the steps I'm taking and what I'm seeing:
Eureka Server
Start the server
cd eureka-server
mvn spring-boot:run
Visit the Eureka dashboard at http://localhost:8761
Note that there are no 'Instances' yet registered
Eureka Client
Start up a client
cd eureka-client
mvn spring-boot:run
Visit the client directly at http://localhost:8080/
The /whoami endpoint will show the client's self-knowledge of its application name and port
{
  "springApplicationName": "eureka-client",
  "serverPort": "8080"
}
The /instances endpoint will take up to a minute to update, but should eventually show all the instances of eureka-client that have been registered with the Eureka Discovery Client.
[
  {
    "host": "hostname",
    "port": 8080,
    "serviceId": "EUREKA-CLIENT",
    "uri": "http://hostname:8080",
    "secure": false
  }
]
You can also visit the Eureka dashboard again now and see it listed there.
Spin up another client with a different name
You can see that another client will be registered by doing the following:
cd eureka-client
mvn spring-boot:run -Dspring.application.name=foo -Dserver.port=8081
The /whoami endpoint will show the name foo and the port 8081.
In a minute or so, the /instances endpoint will show the information about this foo instance too.
On the Eureka dashboard, two clients will now be registered.
Spin up another client with the same name
Now try spinning up another instance of eureka-client by overriding only the port parameter:
cd eureka-client
mvn spring-boot:run -Dserver.port=8082
The /whoami endpoint for http://localhost:8082 shows what we expect.
In a minute or so, the /instances endpoint also shows the instance running on port 8082, but for some reason it no longer shows the instance running on port 8080.
And if we check the /instances endpoint on http://localhost:8080, we now also see only the instance running on 8082 (even though the one on 8080 is clearly running, since it is the very instance we are querying).
The Eureka dashboard only shows 1 instance of eureka-client running.
What's going on here?
Let's try killing the instance running on 8082 and see what happens.
When we query /instances on 8080, it still only shows the instance on 8082.
But a minute later, that goes away and we just see the instance on 8080 again.
The question is, why don't we see both instances of eureka-client when they are both running?
For local deployments, try configuring the {namespace}.instanceId property in eureka-client.properties (or eureka.instance.metadataMap.instanceId in the corresponding YAML file for a Spring Cloud based setup). The behavior is deeply rooted in the way the Eureka server builds application lists and compares InstanceInfo objects in PeerAwareInstanceRegistryImpl: when no more specific data is available (e.g. no instance metadata), it falls back to deriving the id from the hostname.
I wouldn't recommend this for AWS deployments, though, because messing with instanceId makes it hard to figure out which machine hosts a particular service. On the other hand, I doubt you'll host two identical services on one machine, right?
Alternatively, you can get all instances to show up in the admin portal by setting a unique eureka.instance.hostname in your Eureka configuration file.
The hostname is used as the key for storing the InstanceInfo in com.netflix.discovery.shared.Application (since no UniqueIdentifier is set), so you have to use unique hostnames. If you tested Ribbon in this scenario, you would see that the load is not balanced.
The following application.yml is an example:
server:
  port: ${PORT:0}

info:
  component: example.server

logging:
  level:
    com.netflix.discovery: 'OFF'
    org.springframework.cloud: 'DEBUG'

eureka:
  instance:
    leaseRenewalIntervalInSeconds: 1
    leaseExpirationDurationInSeconds: 1
    metadataMap:
      instanceId: ${spring.application.name}:${spring.application.instance_id:${random.value}}
This was caused by a bug in earlier versions of Eureka; for further information see https://github.com/codecentric/spring-boot-admin/issues/134.