Kubernetes: Protect Resource Consumption with Limits and Health Checks

The Big Picture

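A well-behaved application meets three requirements:

  1. It provides the required functionality
  2. It can consume the required amount of CPU, memory and storage
  3. It does not overconsume CPU, memory or storage

Kubernetes covers these with three corresponding settings:

  1. Readiness and liveness checks
  2. The resource requests
  3. The resource limits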

Defining Health Checks

  • Liveness: This check determines whether the pod is operational, meaning it provides the required functionality and endpoints and, in particular, is not deadlocked. If the liveness check fails, the kubelet kills the container and restarts it according to the pod's restart policy.
  • Startup: This check complements the liveness check for pods with a considerable startup time, from several seconds up to minutes. It gives the application time to finish starting. If the startup check exceeds its failure threshold, the container is killed and restarted. When a startup check is defined, the liveness and readiness checks are disabled until it succeeds.
  • Readiness: This check ensures that a pod can receive requests. There are cases where a pod can only work on a limited number of items at the same time, e.g. generating reports. The readiness check determines if the pod can receive another request. If this check fails, the pod is removed from the endpoints of its Services and receives no traffic for the time being. Once it is ready again, it is added back.

Each check can use one of three mechanisms:

  • exec: Run a command inside the container. Only an exit code of 0 counts as a successful check.
  • httpGet: Send an HTTP GET request. Status codes from 200 up to, but not including, 400 are considered successful.
  • tcpSocket: Open a TCP connection to a port on the container. If the connection can be established, the check is successful.
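
The startup check and the tcpSocket mechanism described above can be sketched as the following container spec fragment. The port, period and threshold values are assumptions chosen for illustration; they are not taken from the lighthouse application itself.

```yaml
# Fragment of a pod template's container list (illustrative values only)
containers:
  - name: lighthouse
    image: docker.admantium.com/lighthouse:0.1.9
    startupProbe:
      tcpSocket:
        port: 8080          # assumed port; succeeds once the port accepts connections
      periodSeconds: 5
      failureThreshold: 24  # tolerates up to 24 x 5s = 120s of startup time
    livenessProbe:          # only starts running after the startup probe has succeeded
      httpGet:
        path: /liveness
        port: 8080
      periodSeconds: 10
```

With this kind of setup, a slow start does not trigger liveness restarts, while a hung process after startup is still detected within a few probe periods.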

HTTP Health Check

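apiVersion: apps/v1
kind: Deployment
metadata:
  name: lighthouse
spec:
  replicas: 4
  selector:
    matchLabels:
      app: lighthouse
  template:
    metadata:
      labels:
        app: lighthouse  # must match spec.selector.matchLabels
    spec:
      containers:
        - name: lighthouse
          image: docker.admantium.com/lighthouse:0.1.9
          # Restart the container when GET /liveness fails
          livenessProbe:
            httpGet:
              path: /liveness
              port: 8080
            periodSeconds: 10
          # Stop routing traffic after 6 consecutive failures of GET /readiness
          readinessProbe:
            httpGet:
              path: /readiness
              port: 8080
            periodSeconds: 10
            failureThreshold: 6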

Exec Health Check

apiVersion: apps/v1
kind: Deployment
metadata:
  name: lighthouse
spec:
  replicas: 1
  selector:
    matchLabels:
      app: lighthouse
  template:
    metadata:
      labels:
        app: lighthouse  # must match spec.selector.matchLabels
    spec:
      containers:
        - name: lighthouse
          image: docker.admantium.com/lighthouse:0.1.9
          # Run the script inside the container; a non-zero exit code is a failure
          livenessProbe:
            exec:
              command:
                - /etc/scanner/health_check.sh
            periodSeconds: 10
            failureThreshold: 2

> kb describe pod/lighthouse-575fd6fdd9-7fpts
Events:
  Type     Reason     Age               From                Message
  ----     ------     ----              ----                -------
  Normal   Scheduled  <unknown>         default-scheduler   Successfully assigned default/lighthouse-6bbdb45f9d-6m8hn to k3s-node1
  Normal   Pulling    2m9s              kubelet, k3s-node1  Pulling image "docker.admantium.com/lighthouse:0.1.9.7"
  Normal   Pulled     114s              kubelet, k3s-node1  Successfully pulled image "docker.admantium.com/lighthouse:0.1.9.7"
  Normal   Created    114s              kubelet, k3s-node1  Created container lighthouse
  Normal   Started    113s              kubelet, k3s-node1  Started container lighthouse
  Warning  Unhealthy  2s (x2 over 42s)  kubelet, k3s-node1  Liveness probe failed: Scan daemon is not running.
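
The contents of /etc/scanner/health_check.sh are not shown here. As a rough sketch of how an exec check can produce a message like the one in the event log, the probe below inlines a comparable test. The process name scan-daemon is hypothetical, and the sketch assumes that /bin/sh and pgrep are available in the container image.

```yaml
# Illustrative alternative to calling a script file; not the original health_check.sh
livenessProbe:
  exec:
    command:
      - /bin/sh
      - -c
      # "scan-daemon" is a hypothetical process name; pgrep -x matches it exactly.
      # On failure, the echoed text is reported by the kubelet and the probe exits with 1.
      - 'pgrep -x scan-daemon > /dev/null || { echo "Scan daemon is not running."; exit 1; }'
  periodSeconds: 10
  failureThreshold: 2
```

Whatever the command prints is included in the "Liveness probe failed" event, so a short, descriptive message makes the output of describe easier to read.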

Health Check Configuration

  • initialDelaySeconds: Delay in seconds after the container has started before the first check runs
  • periodSeconds: Period in seconds between checks
  • timeoutSeconds: Time in seconds after which a check without a response counts as failed
  • successThreshold: After a failure, the minimum number of consecutive successes before the check is considered successful again
  • failureThreshold: Number of consecutive failures before the container is marked unhealthy
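
To show how these parameters combine, the fragment below sets all five on a liveness check. The numbers are assumptions for illustration, not recommended values.

```yaml
# Illustrative values only
livenessProbe:
  httpGet:
    path: /liveness
    port: 8080
  initialDelaySeconds: 15  # first check about 15s after the container starts
  periodSeconds: 10        # then one check every 10s
  timeoutSeconds: 2        # a check with no response within 2s counts as failed
  successThreshold: 1      # must be 1 for liveness and startup checks
  failureThreshold: 3      # 3 consecutive failures trigger a restart
```

With these values, a container that stops responding is restarted after roughly failureThreshold x periodSeconds = 30 seconds of consecutive failures, give or take the timeout of the last check.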

Controlling Compute Resources

  • CPU: The number of CPU cores, expressed as a decimal fraction like 0.4 or 1.0, or in millicores (thousandths of a core) like 400m and 1000m
  • Memory: The number of bytes, expressed as an integer with a decimal suffix (k, M, G, T, P, E) or a power-of-two suffix (Ki, Mi, Gi, Ti, Pi, Ei). The power-of-two forms such as Mi (mebibyte) are commonly used because they are exact multiples of 1024

apiVersion: apps/v1
kind: Deployment
metadata:
  name: lighthouse
spec:
  replicas: 4
  selector:
    matchLabels:
      app: lighthouse
  template:
    metadata:
      labels:
        app: lighthouse  # must match spec.selector.matchLabels
    spec:
      containers:
        - name: lighthouse
          image: docker.admantium.com/lighthouse:0.1.9
          livenessProbe:
            httpGet:
              path: /liveness
              port: 8080
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /readiness
              port: 8080
            periodSeconds: 10
            failureThreshold: 6
          resources:
            # Amount the scheduler reserves for the container
            requests:
              memory: 256M
              cpu: 0.33
            # Upper bound the container may not exceed
            limits:
              memory: 512M
              cpu: 0.75
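
The same resources block can also be written with millicore and power-of-two units. The fragment below is a sketch of that notation; note that switching from M to Mi slightly changes the memory amounts, so it is not byte-for-byte identical to the values above.

```yaml
# Alternative unit notation (illustrative)
resources:
  requests:
    memory: 256Mi  # 256 x 1024 x 1024 bytes; 256M would be 256 x 1000 x 1000 bytes
    cpu: 330m      # same as 0.33 cores
  limits:
    memory: 512Mi
    cpu: 750m      # same as 0.75 cores
```

A container that tries to use more CPU than its limit is throttled, while one that exceeds its memory limit is terminated (OOM-killed) and restarted according to the restart policy.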

Conclusion
