Author : MD TAREQ HASSAN | Updated : 2021/07/05
Types
- Horizontal Pod Autoscaling (HPA)
- Cluster Autoscaling
Horizontal Pod Autoscaling
- Kubernetes supports horizontal pod autoscaling to adjust the number of pods in a deployment depending on CPU utilization or other selected metrics
- The Metrics Server provides resource utilization data to Kubernetes, and is automatically deployed in AKS
- To use the autoscaler, every container in your pods must have CPU requests and limits defined (the CPU utilization target is calculated as a percentage of the request)
- See: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
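The scaling decision can be sketched in a few lines of Python. This is only an illustration of the ratio-based rule described in the Kubernetes HPA documentation (desired = ceil(current replicas × current metric / target metric)), not the controller's actual code. The function name `desired_replicas` and its parameters are hypothetical, and the 10% tolerance is assumed to be the documented default, which a cluster may override.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    """Approximate the HPA replica calculation for a single metric.

    Utilization values are percentages of the CPU request (e.g. 50 for 50%).
    """
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        wanted = current_replicas
    else:
        wanted = math.ceil(current_replicas * ratio)
    # The result is always clamped to the configured min/max replica bounds.
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    # 3 pods at 100% average utilization with a 50% target
    print(desired_replicas(3, 100, 50, min_replicas=3, max_replicas=10))  # 6
```

With a 50% target, pods running at twice the target roughly double in number, until `max_replicas` caps the result.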
Resource requests and limits for pod containers
- The CPU resource is measured in CPU units. One CPU in Kubernetes is equivalent to 1 Azure vCore, and fractional values such as 250m (250 millicores, i.e. 0.25 CPU) are allowed
- https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#resource-requests-and-limits-of-pod-and-container
resources:
  requests:
    cpu: 250m
    memory: "64Mi"
  limits:
    cpu: 500m
    memory: "128Mi"
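To make the units concrete, here is a small Python sketch that converts quantities such as `250m` and `64Mi` into millicores and bytes. The helper names `cpu_to_millicores` and `memory_to_bytes` are hypothetical, and only a subset of the suffixes Kubernetes accepts is handled.

```python
def cpu_to_millicores(quantity):
    """Convert a CPU quantity such as "250m", "0.5" or 1 to millicores."""
    text = str(quantity).strip()
    if text.endswith("m"):
        return int(text[:-1])
    return int(round(float(text) * 1000))


# Subset of memory suffixes: binary (Ki, Mi, Gi) and decimal (k, M, G).
_MEMORY_UNITS = {
    "Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3,
    "k": 1000, "M": 1000 ** 2, "G": 1000 ** 3,
}


def memory_to_bytes(quantity):
    """Convert a memory quantity such as "64Mi" or "1G" to bytes."""
    text = str(quantity).strip()
    # Try two-letter suffixes before one-letter ones.
    for suffix in sorted(_MEMORY_UNITS, key=len, reverse=True):
        if text.endswith(suffix):
            return int(float(text[:-len(suffix)]) * _MEMORY_UNITS[suffix])
    return int(text)  # no suffix: plain bytes


if __name__ == "__main__":
    print(cpu_to_millicores("250m"), cpu_to_millicores("0.5"))  # 250 500
    print(memory_to_bytes("64Mi"))  # 67108864
```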
CPU usage examples
apiVersion: apps/v1
kind: Deployment
metadata:
  name: foo-api
  namespace: demo
spec:
  # replicas: 3   # left unset so the autoscaler controls the replica count
  selector:
    matchLabels:
      app: foo-api
  template:
    metadata:
      labels:
        app: foo-api
    spec:
      containers:
      - name: foo-api
        image: myacr.azurecr.io/foo:20210705
        ports:
        - name: http
          containerPort: 80
          protocol: TCP
        resources:
          requests:
            cpu: 250m   # the request must not exceed the limit
          limits:
            cpu: 500m
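Because the HPA CPU target is expressed relative to the container's request, it can help to see how a utilization percentage is derived. The sketch below assumes every pod has the same CPU request (a hypothetical 250m here) and simply averages usage across pods. The function name is hypothetical, and the real controller reads these figures from the Metrics Server.

```python
def average_cpu_utilization_percent(pod_usage_millicores, request_millicores):
    """Average CPU usage across pods as a percentage of the per-pod request."""
    if not pod_usage_millicores:
        raise ValueError("at least one pod is required")
    mean_usage = sum(pod_usage_millicores) / len(pod_usage_millicores)
    return 100.0 * mean_usage / request_millicores


if __name__ == "__main__":
    # Three pods each using 125m against a 250m request
    print(average_cpu_utilization_percent([125, 125, 125], 250))  # 50.0
```

Utilization can exceed 100% when the limit is higher than the request, since a pod may use more CPU than it requested.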
Autoscaler (examples are below)
- Using a YAML manifest file (e.g. foo-hpa.yaml)
- Using a kubectl command
foo-hpa.yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: abc-hpa
spec:
  maxReplicas: 10 # define max replica count
  minReplicas: 3 # define min replica count
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: abc-deployment
  targetCPUUtilizationPercentage: 50 # target CPU utilization
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: xyz-hpa
spec:
  maxReplicas: 10 # define max replica count
  minReplicas: 3 # define min replica count
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: xyz-deployment
  targetCPUUtilizationPercentage: 50 # target CPU utilization
Kubectl command
kubectl autoscale deployment azure-vote-front --cpu-percent=50 --min=3 --max=10
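The command and the manifest form describe the same kind of object. As a rough illustration, the Python sketch below builds an `autoscaling/v1` manifest from the same four inputs and prints it as JSON, which `kubectl apply -f` also accepts. The function name `hpa_manifest` is hypothetical, and naming the HPA after the deployment is an assumption about the command's default behaviour.

```python
import json


def hpa_manifest(deployment, cpu_percent, min_replicas, max_replicas):
    """Build an autoscaling/v1 HorizontalPodAutoscaler manifest as a dict."""
    return {
        "apiVersion": "autoscaling/v1",
        "kind": "HorizontalPodAutoscaler",
        # Assumption: the HPA takes the same name as the target deployment.
        "metadata": {"name": deployment},
        "spec": {
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "targetCPUUtilizationPercentage": cpu_percent,
        },
    }


if __name__ == "__main__":
    print(json.dumps(hpa_manifest("azure-vote-front", 50, 3, 10), indent=2))
```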
Cluster Autoscaler
- AKS can automatically scale the number of nodes up or down
- The cluster autoscaler component watches for pods in your cluster that can’t be scheduled because of resource constraints
- When it detects such pods, it increases the number of nodes in the node pool to meet application demand
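The idea of adding nodes for unschedulable pods can be illustrated with a deliberately simplified Python sketch. It uses first-fit packing on CPU requests only as a stand-in; the real cluster autoscaler simulates the scheduler and considers memory, taints, affinity, and more. All names and numbers here are hypothetical.

```python
def nodes_needed_for_pending(pending_requests_m, node_capacity_m):
    """First-fit: count the new nodes needed to hold pending CPU requests."""
    free = []  # remaining millicores on each hypothetical new node
    for request in pending_requests_m:
        if request > node_capacity_m:
            raise ValueError("a pod requests more CPU than a node can offer")
        for i, remaining in enumerate(free):
            if request <= remaining:
                free[i] -= request
                break
        else:
            # No new node has room for this pod, so open another one.
            free.append(node_capacity_m - request)
    return len(free)


def scaled_node_count(current_nodes, pending_requests_m, node_capacity_m,
                      min_nodes, max_nodes):
    """Node count after scale-up, clamped to the node pool's min/max."""
    wanted = current_nodes + nodes_needed_for_pending(
        pending_requests_m, node_capacity_m)
    return max(min_nodes, min(max_nodes, wanted))


if __name__ == "__main__":
    # Three pending pods of 500m each, nodes with 1000m of allocatable CPU
    print(nodes_needed_for_pending([500, 500, 500], 1000))  # 2
```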
Links
- https://docs.microsoft.com/en-us/azure/aks/tutorial-kubernetes-scale?tabs=azure-cli#autoscale-pods
- https://docs.microsoft.com/en-us/azure/aks/cluster-autoscaler
- https://docs.microsoft.com/en-us/azure/aks/cluster-autoscaler#using-the-autoscaler-profile
- https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/#cpu-units
- Resource requests and limits of Pod and Container
- Configure Default CPU Requests and Limits for a Namespace
- Configure Minimum and Maximum CPU Constraints for a Namespace