I'm reading the traffic management documentation, and I also use Istio for chaos testing.
I know we can route traffic based on header values for A/B testing. What I would like to know is whether I can do the same to return an error from a service or inject a delay, as Istio's chaos testing provides.
That way, I could send requests with the header request-type:chaos, and the chaos testing YAML config would apply only to those requests and not to the rest.
The RouteRule (used in the second link you sent) is deprecated and has been replaced by VirtualService:
VirtualService, DestinationRule, and ServiceEntry replace RouteRule, DestinationPolicy, and EgressRule respectively.
Good instructions on how to configure an HTTP delay fault are presented in the Fault Injection documentation.
Example YAML config for your header (request-type:chaos), based on the Ingress Gateways and Fault Injection documentation:
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
spec:
  hosts:
  - "httpbin.example.com"
  gateways:
  - httpbin-gateway
  http:
  - fault:
      delay:
        fixedDelay: 7s
        percent: 100
    match:
    - headers:
        request-type:
          exact: chaos
    route:
    - destination:
        port:
          number: 80
        host: my-echo-service
  - route:
    - destination:
        port:
          number: 80
        host: my-nginx-service
EDIT: Should work as well for apiVersion: networking.istio.io/v1beta1.
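To exercise the header match in the config above, a client only needs to add the request-type header to the requests that should hit the fault rule. The following is a minimal Java sketch using the JDK's built-in HTTP client classes. The class name ChaosRequests, the example URL and the 15-second timeout are assumptions made for illustration; only the header name and value come from the question.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

final class ChaosRequests {
    // Header name and value from the question; the VirtualService matches them exactly.
    static final String HEADER = "request-type";
    static final String VALUE = "chaos";

    /** Builds a GET request; only requests built with chaos=true carry the fault-triggering header. */
    static HttpRequest build(String url, boolean chaos) {
        HttpRequest.Builder builder = HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(15)) // assumed value, longer than the 7s injected delay
                .GET();
        if (chaos) {
            builder.header(HEADER, VALUE);
        }
        return builder.build();
    }

    public static void main(String[] args) {
        // Hypothetical URL; replace it with the address of your ingress gateway.
        HttpRequest request = build("http://httpbin.example.com/status/200", true);
        System.out.println(request.headers().map());
    }
}
```

Sending the built request with java.net.http.HttpClient and timing the response should show the delay for chaos requests only, assuming the VirtualService is applied as shown.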
Spring Cloud Gateway has a ForwardRoutingFilter
The ForwardRoutingFilter looks for a URI in the exchange attribute ServerWebExchangeUtils.GATEWAY_REQUEST_URL_ATTR. If the URL has a forward scheme (such as forward:///localendpoint), it uses the Spring DispatcherHandler to handle the request.
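The condition quoted above boils down to a scheme check on whatever URI is stored in the exchange attribute at the time the routing filters run. The sketch below models only that check in plain Java; it is not Spring's source code, and the class name ForwardSchemeCheck is made up for illustration.

```java
import java.net.URI;

final class ForwardSchemeCheck {
    /** Returns true when the URI uses the "forward" scheme described in the quoted documentation. */
    static boolean isForward(URI uri) {
        return "forward".equals(uri.getScheme());
    }

    public static void main(String[] args) {
        URI uri = URI.create("forward:///debug");
        System.out.println(isForward(uri)); // scheme check
        System.out.println(uri.getPath());  // local path that would be dispatched
    }
}
```

If the attribute holds a different URI by the time the check runs, for example because another filter wrote to it afterwards, the forward branch would not be taken. Whether that happens in a given filter chain is something to verify, not something this sketch shows.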
I tried the following:
exchange.getAttributes().put(GATEWAY_REQUEST_URL_ATTR, URI.create("forward:///debug"));
return chain.filter(exchange);
But it still seems to route to the original service rather than to my endpoint.
The routes are defined by discovery as follows:
spring:
  cloud:
    gateway:
      discovery:
        locator:
          enabled: true
          predicates:
          - name: Path
            args:
              patterns: "metadata['path']"
          filters:
          - RemoveRequestHeader='Cookie'
          # Prevent propagation of traces
          - RemoveRequestHeader='X-B3-TraceId'
          - RemoveRequestHeader='X-Trace-ID'
          - name: Retry
            args:
              retries: 3
              methods:
              - GET
          - name: RewritePath
            args:
              regexp: "metadata['path.regexp']"
              replacement: "metadata['path.replacement']"
          - name: CircuitBreaker
            args:
              name: "'resilience'"
              fallbackUri: "'forward:/unavailable'"
          # The Grpc is the one that contains the filter
          - name: Grpc
            args:
              serviceId: serviceId
              host: host
              useGrpc: "metadata['protocol'] == 'grpc'"
              port: port
I am using the Spring Cloud Gateway starter ("org.springframework.cloud:spring-cloud-starter-gateway") and its RequestRateLimiter filter to limit the request rate. This is the RequestRateLimiter config:
- filters:
  - name: JwtAuthentication
  - name: RequestRateLimiter
    args:
      redis-rate-limiter.replenishRate: 1
      redis-rate-limiter.burstCapacity: 2
  id: time-capsule-service
  order: 2
  predicates:
  - Path=/tik/**
  uri: http://10.111.149.10:11015
Now I am facing a problem: all requests return HTTP 403. Tracing the source code, I found that the key resolver gets an empty key. Must I customize the key resolver myself? BTW, I have already configured the Redis server address.
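To make the two arguments in the config concrete: replenishRate is how many requests per second are allowed on a sustained basis, and burstCapacity is the maximum the bucket can hold. The plain-Java token bucket below is a simplified model of that idea, not the Redis-backed implementation the gateway uses, and it does not model key resolution (the part that fails in the question). The class name TokenBucket and the explicit clock parameter are assumptions chosen to keep the sketch deterministic.

```java
final class TokenBucket {
    private final double replenishRate; // tokens added per second
    private final double burstCapacity; // maximum number of tokens the bucket can hold
    private double tokens;
    private double lastSeconds;

    TokenBucket(double replenishRate, double burstCapacity, double nowSeconds) {
        this.replenishRate = replenishRate;
        this.burstCapacity = burstCapacity;
        this.tokens = burstCapacity; // assumption: the bucket starts full
        this.lastSeconds = nowSeconds;
    }

    /** Tries to consume one token at the given time; returns true if the request is allowed. */
    boolean tryAcquire(double nowSeconds) {
        double elapsed = Math.max(0, nowSeconds - lastSeconds);
        tokens = Math.min(burstCapacity, tokens + elapsed * replenishRate);
        lastSeconds = nowSeconds;
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        TokenBucket bucket = new TokenBucket(1, 2, 0.0); // values from the config above
        for (double t : new double[] {0.0, 0.0, 0.0, 1.0}) {
            System.out.println("t=" + t + " allowed=" + bucket.tryAcquire(t));
        }
    }
}
```

With replenishRate 1 and burstCapacity 2, two requests in the same instant pass, a third is rejected, and one more passes a second later.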
I am running a Kubernetes job (job-1) from a base pod, and it works for the basic use case. For the second use case, I want to trigger another Kubernetes job (job-2) from the already running job-1. While starting job-2, I get the service account error given below:
Error occurred while starting container for Prowler due to exception : Failure executing: POST at: https://172.20.0.1/apis/batch/v1/namespaces/my-namespace/jobs. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. jobs.batch is forbidden: User "system:serviceaccount:my-namespace:default" cannot create resource "jobs" in API group "batch" in the namespace "my-namespace".
I have created a service account with the required permissions as given below:
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: my-sa-service-role-binding
subjects:
- kind: ServiceAccount
  name: my-sa
  namespace: my-namespace
roleRef:
  kind: Role
  name: my-namespace
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: my-sa-service-role
rules:
- apiGroups: [""]
  resources: ["secrets", "pods"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["batch", "extensions"]
  resources: ["jobs"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: [""]
  resources: ["pods/log"]
  verbs: ["get", "list"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-sa
I am passing "my-sa" as the service account name, but it still refers to the default service account.
I am using the fabric8io Kubernetes client to trigger the job; below is my code:
final Job job = new JobBuilder()
        .withApiVersion("batch/v1")
        .withNewMetadata()
            .withName("demo")
            .withLabels(Collections.singletonMap("label1", "maximum-length-of-63-characters"))
            .withAnnotations(Collections.singletonMap("annotation1", "some-annotation"))
        .endMetadata()
        .withNewSpec()
            .withParallelism(1)
            .withNewTemplate()
                .withNewSpec()
                    .withServiceAccount("my-sa")
                    .addNewContainer()
                        .withName("prowler")
                        .withImage("demo-image")
                        .withEnv(env)
                    .endContainer()
                    .withRestartPolicy("Never")
                .endSpec()
            .endTemplate()
        .endSpec()
        .build();
If you look at the error message in detail, you'll find that your client is not using the service account you created (my-sa). It's using the default service account in the namespace instead:
"system:serviceaccount:my-namespace:default" cannot create resource "jobs"
And it is safe to assume that the default service account does not have the privileges to create jobs.
It is worth looking into the official fabric8io documentation to see how you can authenticate with a custom service account. From what I could find in the docs, it is mostly handled by mounting the secret that corresponds to the service account into the pod, and then configuring your application code or possibly setting a specific environment variable.
I have configured Zuul with Eureka so that 3 identical instances of a service run in parallel. I call the gateway on port 8400, which routes incoming requests to ports 8420, 8430 and 8440 in a round-robin manner. It works smoothly. Now, if I switch off one of the 3 services, a small number of incoming requests fail with the following exception:
com.netflix.zuul.exception.ZuulException: Filter threw Exception
=> 1: java.util.concurrent.FutureTask.report(FutureTask.java:122)
=> 3: hu.perit.spvitamin.core.batchprocessing.BatchProcessor.process(BatchProcessor.java:106)
caused by: com.netflix.zuul.exception.ZuulException: Filter threw Exception
=> 1: com.netflix.zuul.FilterProcessor.processZuulFilter(FilterProcessor.java:227)
caused by: org.springframework.cloud.netflix.zuul.util.ZuulRuntimeException: com.netflix.zuul.exception.ZuulException: Forwarding error
=> 1: org.springframework.cloud.netflix.zuul.filters.route.RibbonRoutingFilter.run(RibbonRoutingFilter.java:124)
caused by: com.netflix.zuul.exception.ZuulException: Forwarding error
=> 1: org.springframework.cloud.netflix.zuul.filters.route.RibbonRoutingFilter.handleException(RibbonRoutingFilter.java:198)
caused by: com.netflix.client.ClientException: com.netflix.client.ClientException
=> 1: com.netflix.client.AbstractLoadBalancerAwareClient.executeWithLoadBalancer(AbstractLoadBalancerAwareClient.java:118)
caused by: java.lang.RuntimeException: org.apache.http.NoHttpResponseException: scalable-service-2:8430 failed to respond
=> 1: rx.exceptions.Exceptions.propagate(Exceptions.java:57)
caused by: org.apache.http.NoHttpResponseException: scalable-service-2:8430 failed to respond
=> 1: org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:141)
My Zuul routing looks like this:
### Zuul routes
zuul.routes.scalable-service.path=/scalable/**
#Authorization header will be forwarded to scalable-service
zuul.routes.scalable-service.sensitiveHeaders: Cookie,Set-Cookie
zuul.routes.scalable-service.serviceId=template-scalable-service
It takes a while until Eureka discovers that the service is no longer available.
My question is: is it possible to configure Zuul so that, in case of a NoHttpResponseException, it forwards the request to another available instance in the pool?
By default, Eureka expects an instance to renew its lease every 30s and evicts it if the lease has not been renewed within 90s. In your case, the instance had not been evicted yet because its lease was still valid.
To shorten this window, you can decrease the renewal interval and the lease duration through config on the Eureka client and the Eureka server, as described here.
Note: If you hit the actuator /shutdown endpoint, the instance is immediately evicted.
I finally found the solution to the problem. The appropriate search phrase was 'fault tolerance'. The key is the autoretry config in the following application.properties file. The value of template-scalable-service.ribbon.MaxAutoRetriesNextServer must be set to at least 6 for 3 pooled services to achieve full fault tolerance. With that setup I can kill 2 of the 3 services at any time and no incoming request fails. In the end I set it to 10; this causes no unnecessary increase in timeout, because Hystrix will break the line.
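The idea behind MaxAutoRetriesNextServer can be sketched as a loop that moves on to the next instance after a failed attempt. The Java below is an idealized model, not Ribbon's implementation: it assumes a strict round-robin order per request, and the class name NextServerRetry and the first and third server names are hypothetical (only scalable-service-2:8430 appears in the stack trace).

```java
import java.util.List;
import java.util.function.Predicate;

final class NextServerRetry {
    /**
     * Tries servers in round-robin order starting at startIndex. The first attempt is not
     * counted as a retry; after a failure, up to maxRetriesNextServer further servers are tried.
     * Returns the server that answered, or null if every attempt failed.
     */
    static String call(List<String> servers, int startIndex, int maxRetriesNextServer,
                       Predicate<String> isUp) {
        for (int attempt = 0; attempt <= maxRetriesNextServer; attempt++) {
            String server = servers.get((startIndex + attempt) % servers.size());
            if (isUp.test(server)) {
                return server;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> servers = List.of(
                "scalable-service-1:8420", "scalable-service-2:8430", "scalable-service-3:8440");
        Predicate<String> onlyThirdIsUp = s -> s.endsWith(":8440");
        System.out.println(call(servers, 0, 0, onlyThirdIsUp)); // no retries allowed
        System.out.println(call(servers, 0, 2, onlyThirdIsUp)); // two retries allowed
    }
}
```

In this idealized model two retries are enough to survive two dead instances out of three. The answer above reports needing at least 6 in practice; one possible reason (a guess, not verified) is that a real load balancer shares its rotation across concurrent requests, so consecutive retries of one request are not guaranteed to land on distinct servers.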
### Eureka config
eureka.instance.hostname=${hostname:localhost}
eureka.instance.instanceId=${eureka.instance.hostname}:${spring.application.name}:${server.port}
eureka.instance.non-secure-port-enabled=false
eureka.instance.secure-port-enabled=true
eureka.instance.secure-port=${server.port}
eureka.instance.lease-renewal-interval-in-seconds=5
eureka.instance.lease-expiration-duration-in-seconds=10
eureka.datacenter=perit.hu
eureka.environment=${EUREKA_ENVIRONMENT_PROFILE:dev}
eureka.client.serviceUrl.defaultZone=${EUREKA_SERVER:https://${server.fqdn}:${server.port}/eureka}
eureka.client.server.waitTimeInMsWhenSyncEmpty=0
eureka.client.registry-fetch-interval-seconds=5
eureka.dashboard.path=/gui
eureka.server.enable-self-preservation=false
eureka.server.expected-client-renewal-interval-seconds=10
eureka.server.eviction-interval-timer-in-ms=2000
### Ribbon
ribbon.IsSecure=true
ribbon.NFLoadBalancerPingInterval=5
ribbon.ConnectTimeout=30000
ribbon.ReadTimeout=120000
### Zuul config
zuul.host.connectTimeoutMillis=30000
zuul.host.socketTimeoutMillis=120000
zuul.host.maxTotalConnections=2000
zuul.host.maxPerRouteConnections=200
zuul.retryable=true
### Zuul routes
#template-scalable-service
zuul.routes.scalable-service.path=/scalable/**
#Authorization header will be forwarded to scalable-service
zuul.routes.scalable-service.sensitiveHeaders=Cookie,Set-Cookie
zuul.routes.scalable-service.serviceId=template-scalable-service
# Autoretry config for template-scalable-service
template-scalable-service.ribbon.MaxAutoRetries=0
template-scalable-service.ribbon.MaxAutoRetriesNextServer=10
template-scalable-service.ribbon.OkToRetryOnAllOperations=true
#template-auth-service
zuul.routes.auth-service.path=/auth/**
#Authorization header will be forwarded to auth-service
zuul.routes.auth-service.sensitiveHeaders=Cookie,Set-Cookie
zuul.routes.auth-service.serviceId=template-auth-service
# Autoretry config for template-auth-service
template-auth-service.ribbon.MaxAutoRetries=0
template-auth-service.ribbon.MaxAutoRetriesNextServer=0
template-auth-service.ribbon.OkToRetryOnAllOperations=false
### Hystrix
hystrix.command.default.execution.timeout.enabled=false
Besides this, I have a profile-specific setup in application-discovery.properties:
#Microservice environment
eureka.client.registerWithEureka=false
eureka.client.fetchRegistry=true
spring.cloud.loadbalancer.ribbon.enabled=true
I start my server in a docker container like this:
services:
  discovery:
    container_name: discovery
    image: template-eureka
    environment:
      #agentlib for remote debugging
      - JAVA_OPTS=-DEUREKA_SERVER=https://discovery:8400/eureka -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5005
      - TEMPLATE_EUREKA_OPTS=-Dspring.profiles.active=default,dev,discovery
      - EUREKA_ENVIRONMENT_PROFILE=dev
    ports:
      - '8400:8400'
      - '5500:5005'
    networks:
      - back-tier-net
      - monitoring
    hostname: 'discovery'
See the complete solution in GitHub.
I have an ft-admin microservice which exposes 2 endpoints, app-config and app-analytic. In my @EnableZuulProxy gateway project, I can define the routing rules and specify other configuration for Ribbon and Hystrix at the microservice level using the serviceId, as follows.
zuul:
  routes:
    admin-services:
      path: /admin/**
      serviceId: ft-admin
      stripPrefix: true
ft-admin:
  ribbon:
    ActiveConnectionsLimit: 2
hystrix:
  command:
    ft-admin:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 10000
I'm wondering if there's a way to bring the above configuration down to the endpoint level, for app-config and app-analytic individually. The goal is to be able to give each endpoint a different setting, as follows.
zuul:
  routes:
    app-config-endpoint:
      path: /app-config/**
      serviceId: ft-admin
      stripPrefix: false
    app-analytic-endpoint:
      path: /app-analytic/**
      serviceId: ft-admin
      stripPrefix: false
app-config-endpoint:
  ribbon:
    ActiveConnectionsLimit: 5
app-analytic-endpoint:
  ribbon:
    ActiveConnectionsLimit: 2
...
When I run my gateway project in Debug mode with ft-admin.ribbon.ActiveConnectionsLimit: 2, I can see the following lines in the log.
c.netflix.config.ChainedDynamicProperty : Property changed: 'ft-admin.ribbon.ActiveConnectionsLimit = 2'
c.netflix.config.ChainedDynamicProperty : Flipping property: ft-admin.ribbon.ActiveConnectionsLimit to use it's current value:2
However, when I run my project with app-config-endpoint.ribbon.ActiveConnectionsLimit: 5, I see the following lines.
c.netflix.config.ChainedDynamicProperty : Property changed: 'ft-admin.ribbon.ActiveConnectionsLimit = -2147483648'
c.netflix.config.ChainedDynamicProperty : Flipping property: ft-admin.ribbon.ActiveConnectionsLimit to use NEXT property: niws.loadbalancer.availabilityFilteringRule.activeConnectionsLimit = 2147483647
I've searched through a ton of posts, but it seems like the configuration always stops at the microservice level. The endpoint/route-level configuration is completely ignored.
I'd be very grateful if you could point me in the right direction or tell me your story if you've tried this before.