I have a Spring Boot app running with Spring Actuator enabled, and I am using the Actuator health endpoint as both the readiness and liveness check. Everything works fine with a single replica. When I scale out to 2 replicas, both pods crash: they fail their readiness checks and end up in an endless destroy/re-create loop. If I scale back down to 1 replica, the cluster recovers and the Spring Boot app becomes available again. Any ideas what might be causing this?
Here is the deployment config (the context root of the Spring Boot app is /dept):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gl-dept-deployment
  labels:
    app: gl-dept
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  selector:
    matchLabels:
      app: gl-dept
  template:
    metadata:
      labels:
        app: gl-dept
    spec:
      containers:
        - name: gl-dept
          image: zmad5306/gl-dept:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: /dept/actuator/health
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 10
            timeoutSeconds: 10
            successThreshold: 1
            failureThreshold: 5
          readinessProbe:
            httpGet:
              path: /dept/actuator/health
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 10
            timeoutSeconds: 10
            successThreshold: 1
            failureThreshold: 5
The curl command hangs, and it appears the entire minikube VM hangs; the dashboard quits responding as well.
So in that case, I would guess the VM backing minikube is sized too small to handle everything deployed inside it. I haven't played with minikube enough to know how much it carries over from its libmachine underpinnings, but in the case of docker-machine, one can provide --virtualbox-memory=4096 (or set an environment variable: env VIRTUALBOX_MEMORY_SIZE=4096 docker-machine ...). And, of course, one should use the memory settings that correspond to the driver in use by minikube (HyperKit, xhyve, Hyper-V, whatever).
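Minikube itself also exposes this directly, so you may not need the driver's own tooling. A quick sketch (standard minikube flags, but note an existing VM keeps its old size, so it has to be recreated):

minikube delete
minikube start --memory=4096 --cpus=2

Or persist the setting for future VMs with minikube config set memory 4096.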
I have configured Docker Compose for the OpenTelemetry Collector, Prometheus, and Jaeger, and I send data via the OTel agent. Jaeger is working fine, but Prometheus is not showing any metrics despite the collector receiving metrics data.
Following is my configuration:
docker-compose.yml:
# docker-compose.yml file
version: "3.5"
services:
  jaeger:
    container_name: jaeger
    hostname: jaeger
    networks:
      - backend
    image: jaegertracing/all-in-one:latest
    volumes:
      - "./jaeger-ui.json:/etc/jaeger/jaeger-ui.json"
    command: --query.ui-config /etc/jaeger/jaeger-ui.json
    environment:
      - METRICS_STORAGE_TYPE=prometheus
      - PROMETHEUS_SERVER_URL=http://prometheus:9090
    ports:
      - "14250:14250"
      - "14268:14268"
      - "6831:6831/udp"
      - "16686:16686"
      - "16685:16685"
  collector:
    container_name: collector
    hostname: collector
    networks:
      - backend
    image: otel/opentelemetry-collector-contrib:latest
    volumes:
      - "./otel-collector-config.yml:/etc/otelcol/otel-collector-config.yml"
    command: --config /etc/otelcol/otel-collector-config.yml
    ports:
      - "5555:5555"
      - "6666:6666"
    depends_on:
      - jaeger
  prometheus:
    container_name: prometheus
    hostname: prometheus
    networks:
      - backend
    image: prom/prometheus:latest
    volumes:
      - "./prometheus.yml:/etc/prometheus/prometheus.yml"
    ports:
      - "9090:9090"
networks:
  backend:
otel-collector-config.yml:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:5555
processors:
  batch:
    timeout: 1s
    send_batch_size: 1
exporters:
  prometheus:
    endpoint: "collector:6666"
  jaeger:
    endpoint: "jaeger:14250" # using the docker-compose name of the jaeger container
    tls:
      insecure: true
service:
  pipelines:
    traces:
      receivers: [ otlp ]
      processors: [ batch ]
      exporters: [ jaeger ]
    metrics:
      receivers: [ otlp ]
      processors: [ batch ]
      exporters: [ prometheus ]
prometheus.yml:
global:
  scrape_interval: 1s # Set the scrape interval to every 1 second. Default is every 1 minute.
  evaluation_interval: 1s # Evaluate rules every 1 second. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
scrape_configs:
  - job_name: collector
    scrape_interval: 1s
    static_configs:
      - targets: [ 'collector:6666' ] # using the name of the OpenTelemetry Collector container defined in the docker-compose file
Following is my tracer.properties config used for the OTel Java agent:
otel.traces.exporter=otlp,logging
otel.metrics.exporter=otlp
otel.logs.exporter=none
otel.service.name=service1
otel.exporter.otlp.endpoint=http://0.0.0.0:5555
otel.exporter.otlp.protocol=grpc
otel.traces.sampler=always_on
otel.metric.export.interval=1000
I can get trace data in Jaeger without any issues. However, metrics are not working, and I am unable to see any metrics data in Prometheus.
What config am I missing for this to work? Also, please suggest how to optimize this for production.
The bind address for the prometheus exporter is "collector:6666". This means the created server will accept requests only on port 6666 and only on the interface that resolves to the host collector. However, Prometheus scrapes it from a different host.
It's better to bind to "any address", e.g. "0.0.0.0:6666".
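A minimal sketch of the corrected exporter section (everything else in the collector config above stays the same):

exporters:
  prometheus:
    endpoint: "0.0.0.0:6666"  # bind to all interfaces so Prometheus can scrape from another container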
Also, you can use the prometheusremotewrite exporter instead of prometheus. Since remote write pushes rather than waits to be scraped, any delivery problems will show up in the collector logs.
The Monitor tab in Jaeger requires you to set up the spanmetrics processor. This processor looks at spans sent to the OpenTelemetry Collector and, if the span.kind is server, creates duration metrics for those spans and keeps them in memory until Prometheus scrapes the metrics endpoint (typically on port 8889). The Jaeger UI can then collect these metrics from Prometheus.
Without the spanmetrics processor, you will not be able to see any data in Jaeger's Monitor tab.
Look at the service performance monitoring documentation on setting up the Monitor tab, as it describes these details.
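For reference, a rough sketch of wiring spanmetrics into the collector config from the question. This uses the contrib spanmetrics processor; note that metrics_exporter must name an exporter defined in the same config, and that newer collector releases replace this processor with a spanmetrics connector, so check your version:

processors:
  batch:
    timeout: 1s
    send_batch_size: 1
  spanmetrics:
    metrics_exporter: prometheus  # must match an exporter name in your config
service:
  pipelines:
    traces:
      receivers: [ otlp ]
      processors: [ spanmetrics, batch ]
      exporters: [ jaeger ]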
Is there a way to specify a custom NodePort port in a Kubernetes service YAML definition?
I need to be able to define the port explicitly in my configuration file.
You can set type: NodePort in your Service definition. Note that there is a node port range configured for your API server with the option --service-node-port-range (by default 30000-32767). You can specify a port in that range explicitly by setting the nodePort attribute of a port entry, or the system will choose a port in that range for you.
So a Service example with a specified NodePort would look like this:
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    name: nginx
spec:
  type: NodePort
  ports:
    - port: 80
      nodePort: 30080
      name: http
    - port: 443
      nodePort: 30443
      name: https
  selector:
    name: nginx
For more information on NodePort, see this doc. For configuring the API server node port range, please see this.
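If you are on minikube and need ports outside the default range, you can usually widen it at start time by passing the apiserver flag through (a sketch; verify against your minikube version):

minikube start --extra-config=apiserver.service-node-port-range=30000-40000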
You can define a static nodePort in your service.yaml file:
spec:
  type: NodePort
  ports:
    - port: 3000
      nodePort: 31001
      name: http
Yes, you can define all three ports yourself:
apiVersion: v1
kind: Service
metadata:
  name: posts-srv
spec:
  type: NodePort
  selector:
    app: posts
  ports:
    - name: posts
      protocol: TCP
      port: 4000
      targetPort: 4000
      nodePort: 31515
You can actually run this command to see how to achieve that in YAML:
kubectl create service nodeport hello-svc --tcp=80:80 --node-port=30080 --dry-run=client -o yaml > hello-svc.yaml
https://pachehra.blogspot.com/2019/11/kubernetes-imperative-commands-with.html
For those who need to use kubectl commands without creating a YAML file, you can create a NodePort service with a specified port:
kubectl create service nodeport NAME [--tcp=port:targetPort] [--dry-run=server|client|none]
For example:
kubectl create service nodeport myservice --node-port=31000 --tcp=3000:80
You can check Kubectl reference for more:
https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#-em-service-nodeport-em-
From my application side, I have a function named myfunction. Via this function we can call OPA, passing its endpoint and the OPA input as function parameters, and it gives the response back through the "data" argument of the "function(context, data)" callback. This is how I call the function:
myfunction('http://localhost:8181/v1/data/play/policy', OPAinput, {
    onSuccess: function(context, data) {
        var permit = data.result.permit;
        Log.info('permit ' + permit);
        Log.info("Successfully posted data.");
    },
    onFail: function(context) {
        Log.info("Failed to post data");
    }
});
When I tested this function by running OPA with the application locally, it worked fine. But now I have deployed OPA with the application as a sidecar container on GKE, and when I try the same thing it doesn't work. It says:
“Cannot get property "permit" of null at jdk.scripting.nashorn/jdk.nashorn.internal.runtime.ECMAErrors.error(ECMAErrors.java:57) at jdk.scripting.nashorn/jdk.nashorn.internal.runtime.ECMAErrors.typeError(ECMAErrors.java:213………….”
These are the OPA logs:
2020-06-26 15:38:22.000 IST {"level":"info","msg":"Initializing server.","insecure_addr":"","diagnostic-addrs":[],"addrs":[":8181"]}
2020-06-26 16:24:52.000 IST {"msg":"Received request.","req_path":"/v1/data/play/policy","req_id":1,"level":"info","req_method":"POST","client_addr":"127.0.0.1:39530"}
2020-06-26 16:24:52.000 IST {"resp_status":200,"level":"info","req_method":"POST","req_id":1,"client_addr":"127.0.0.1:39530","req_path":"/v1/data/play/policy","resp_bytes":2,"msg":"Sent response.","resp_duration":9.564696}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rss-site
  namespace: myapp
spec:
  replicas: 1
  minReadySeconds: 30
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  selector:
    matchLabels:
      deployment: myapp
      app: myapp
      pod: myapp
  template:
    metadata:
      labels:
        deployment: myapp
        app: myapp
        pod: myapp
    spec:
      containers:
        - name: opa
          image: openpolicyagent/opa:latest
          ports:
            - name: http
              containerPort: 8181
          args:
            - "run"
            - "--ignore=.*" # exclude hidden dirs created by Kubernetes
            - "--server"
            - "/policies"
          volumeMounts:
            - readOnly: true
              mountPath: /policies
              name: example-policy
        - name: myapp
          image: nickchase/myapp:v1
          ports:
            - containerPort: 9763
              protocol: TCP
          volumeMounts:
            - name: myapp-server-conf
              mountPath: /home/myapp/myapp-config-volume/repository/conf/deployment.toml
              subPath: deployment.toml
      serviceAccountName: "myappsvc-account"
      volumes:
        - name: myapp-server-conf
          configMap:
            name: myapp-server-conf
        - name: example-policy
          configMap:
            name: example-policy
Could you please help me to identify this issue :(
When I tested this function by running OPA with the application locally, it worked fine. But now I have deployed OPA with the application as a sidecar container on GKE, and I tried the same thing but it doesn't work.
If it works locally and not in GKE, that means something is different. Since OPA gives back an HTTP 200 response, the OPA container is likely running OK, but either the policy, input, or data is different from what you had running locally.
Try enabling the console decision logger via --set=decision_logs.console=true in the OPA args. This will show you, in OPA's log output, the input it received as well as the result it sent back. That should help guide the investigation.
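For example, in the deployment from the question that would just be one extra line in the OPA container args:

args:
  - "run"
  - "--ignore=.*" # exclude hidden dirs created by Kubernetes
  - "--server"
  - "--set=decision_logs.console=true" # log each decision with its input and result
  - "/policies"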
I would also double check that all of the policies and data have been loaded into OPA in the same way as you had locally. Differences in directory paths can affect any *.json/*.yaml files loaded, and any missing or otherwise different *.rego files would affect the result as well.
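One quick way to compare is to query OPA's standard REST API from inside the pod (assuming a shell with curl is available there):

# list the policy modules OPA actually has loaded
curl -s localhost:8181/v1/policies
# inspect the data documents OPA has loaded
curl -s localhost:8181/v1/data

If the output differs from what you see locally, you've found the gap.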
This is a Java Spring Boot REST API application, using an index.html to present the UI web page to the user.
When index.html is displayed, it triggers logic in the JavaScript/jQuery to make a REST API call (coded as below) to the backend service in the Java controller class to get 2 randomly generated numbers:
$.ajax({
    url: "http://localhost:8080/multiplications/random"
The program works fine when I run it as a Spring Boot app in Eclipse!
However, it doesn't work after I used the .jar file to build a Docker image and deployed it using Kubernetes/minikube (I'm new to Docker/Kubernetes).
Here's the Dockerfile that builds the image from the .jar:
FROM openjdk:latest
ADD target/social-multiplication-v3-0.3.0-SNAPSHOT.jar app.jar
ENTRYPOINT ["java","-jar","app.jar"]
EXPOSE 8080
Here's the deployment.yaml file:
---
kind: Service
apiVersion: v1
metadata:
  name: multiplicationservice
spec:
  selector:
    app: multiplication
  ports:
    - protocol: "TCP"
      port: 80
      targetPort: 8080
      nodePort: 30001
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mdeploy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: multiplication
  template:
    metadata:
      labels:
        app: multiplication
    spec:
      containers:
        - name: multiplication
          image: huxianjun/multiplication
          ports:
            - containerPort: 80
And the IP address of the host where the application is deployed in Kubernetes:
$ minikube ip
192.168.99.101
In the end, I can get to the index.html page from the browser via the following URL:
http://192.168.99.101:30001/
The page is displayed as expected. What is NOT working: the following REST API call never happens, so the 2 numbers are not returned and displayed on the page:
$.ajax({
    url: "http://localhost:8080/multiplications/random"
My guess is that it's caused by 'localhost' and port 8080 not being aligned with the ports defined in the deployment.yaml file, or maybe something conflicting with the 'EXPOSE 8080' in the Dockerfile?
In your case, you are calling $.ajax from your browser, which runs on your host machine; those API calls are therefore sent from your local machine, not from within the Docker container.
To solve the problem, update the URL to use http://192.168.99.101:30001, like this:
$.ajax({
    url: "http://192.168.99.101:30001/multiplications/random"
Try running sudo lsof -i :8080 if you're on Linux; it'll show what's listening on that port. If you don't see port 8080, your application's port isn't available or visible on your localhost. That's because Docker containers are closed/isolated from external processes and files by default; the EXPOSE 8080 instruction in the Dockerfile alone is not enough.
Try docker run -p 8080:8080 YOUR_CREATED_IMAGE_NAME. It will map the container's localhost:8080 to your host's localhost:8080.
I have 3 projects: a Hystrix dashboard, a Turbine server (using AMQP), and an API.
When I start in the development environment, I set up 2 instances of the API (using ports 8080 and 8081). To test the Turbine aggregation, I make some calls, and in the dashboard I can see Hosts: 2.
However, when I use Docker, even when the load balancer hits both servers, I only see one host on the Hystrix dashboard.
My assumptions:
1. As both containers start on the same port (8080), Turbine sees them as one.
2. As I also dockerize RabbitMQ, this may be causing problems.
Here is my docker-compose.yml file:
version: '2'
services:
  postgres:
    image: postgres:9.5
    ports:
      - "5432"
    environment:
      POSTGRES_PASSWORD: postgres
      POSTGRES_USER: postgres
      POSTGRES_DB: fq
    volumes:
      - /var/lib/postgresql
  rabbitmq:
    image: rabbitmq:3-management
    ports:
      - "5672"
      - "15672"
    environment:
      RABBITMQ_DEFAULT_USER: turbine
      RABBITMQ_DEFAULT_PASS: turbine
    volumes:
      - /var/lib/rabbitmq/
  hystrix:
    build: hystrixdashboard/.
    links:
      - turbine_server
    ports:
      - "8989:8989"
  turbine_server:
    build: turbine/.
    links:
      - rabbitmq
    ports:
      - "8090:8090"
  persona_api:
    build: persona/.
    ports:
      - "8080"
    links:
      - postgres
      - rabbitmq
  lb:
    image: 'dockercloud/haproxy:1.5.1'
    links:
      - persona_api
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    ports:
      - 80:80
And here is my persona_api config file:
spring:
  application:
    name: persona_api
  profiles:
    active: dev
  rabbitmq:
    addresses: 127.0.0.1:5672
    username: turbine
    password: turbine
    useSSL: false
server:
  compression.enabled: true
  port: ${PORT:8080}
params:
  datasource:
    driverClassName: org.postgresql.Driver
    username: postgres
    password: postgres
    maximumPoolSize: 10
    poolName: fq_connection_pool
spring.jpa:
  show-sql: true
  hibernate:
    ddl-auto: update
turbine:
  aggregator:
    clusterConfig: persona_api
  appConfig: persona_api
---
spring:
  profiles: dev
params:
  datasource:
    jdbcUrl: jdbc:postgresql://127.0.0.1:5432/fq
---
spring:
  profiles: docker
  rabbitmq:
    addresses: rabbitmq:5672
params:
  datasource:
    jdbcUrl: jdbc:postgresql://postgres:5432/fq
I'm afraid that if I deploy this to production (on Rancher or Docker Cloud), I'll see the same problem.
Here is a GIF of what happens when I set up two load-balanced APIs:
Try:
hystrix.stream.queue.send-id=false
in your API
I assume your problem is the RabbitMQ connection you are using. The connection string you are using is localhost, but localhost only resolves to RabbitMQ inside the RabbitMQ container itself; from every other container, no connection will be available on localhost. I suggest you inject the RabbitMQ host into your Spring connection using environment variables. If I read your files correctly, it should be ${RABBITMQ_PORT_5672_TCP_ADDR} instead of localhost. Be aware that I couldn't try this; it's just an educated guess. Better to double-check by running env inside your persona_api container when everything is running.
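A sketch of what that could look like in the docker profile of the persona_api config above (the ${VAR:default} placeholder syntax is standard Spring; the variable name is the one suggested here as an educated guess, so verify it exists with env inside the container):

spring:
  profiles: docker
  rabbitmq:
    # fall back to the compose service name if the link variable is not set
    addresses: ${RABBITMQ_PORT_5672_TCP_ADDR:rabbitmq}:5672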
This should fix your issue:
eureka:
  instance:
    prefer-ip-address: true
    instance-id: ${spring.cloud.client.ipAddress}:${server.port} # make the application unique at the Rancher service layer
spring:
  application:
    index: ${random.uuid} # make the application unique at the Rancher container layer (same service, multiple instances)
https://github.com/spring-cloud/spring-cloud-netflix/issues/740
You need both instance-id: ${spring.cloud.client.ipAddress}:${server.port} and index: ${random.uuid}.