Azure Kubernetes Service

Author : MD TAREQ HASSAN | Updated : 2021/07/05

Types

Horizontal Pod Autoscaling (HPA)
Cluster Autoscaling

Horizontal Pod Autoscaling

Kubernetes supports horizontal pod autoscaling, which adjusts the number of pods in a deployment depending on CPU utilization or other selected metrics.
The Metrics Server provides resource utilization data to Kubernetes and is deployed automatically in AKS.
To use the autoscaler, every container in your pods must have CPU requests defined, because CPU utilization is calculated as a percentage of the requested CPU; defining limits as well is recommended.
See: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
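The scaling decision described above follows the formula given in the Kubernetes documentation: desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The following is a minimal Python sketch of that calculation, not the controller's actual code. The function name and parameters are hypothetical; the 0.1 tolerance is the documented default, and the min/max clamp mirrors the minReplicas and maxReplicas fields of the HPA spec.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    """Approximate the HPA replica calculation (illustrative sketch only).

    Utilization values are percentages of the CPU *request*, averaged over pods.
    """
    ratio = current_utilization / target_utilization
    # The controller skips scaling when the ratio is close enough to 1.0.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp the result to the configured replica bounds.
    return max(min_replicas, min(max_replicas, wanted))


# 3 pods running at 100% of their CPU request with a 50% target
print(desired_replicas(3, 100, 50, min_replicas=3, max_replicas=10))  # 6
```

The real controller also accounts for pods that are not yet ready, missing metrics, and stabilization windows, all of which this sketch leaves out.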

Resource requests and limits for pod containers

The CPU resource is measured in CPU units. One CPU in Kubernetes is equivalent to “1 Azure vCore”, and a value such as 250m means 250 millicpu (0.25 CPU).
See: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#resource-requests-and-limits-of-pod-and-container

resources:
  requests:
    cpu: 250m
    memory: "64Mi"
  limits:
    cpu: 500m
    memory: "128Mi"
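To make the units in the snippet above concrete, here is a small Python sketch that converts Kubernetes-style quantity strings into plain numbers. The helper names are hypothetical and only the common suffixes are handled; this is not the parser Kubernetes itself uses.

```python
# Binary (power-of-two) and decimal suffixes commonly used for memory quantities.
_BINARY = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}
_DECIMAL = {"k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4}


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '250m' or '0.5' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as '64Mi' or '128M' to bytes."""
    for suffix, factor in _BINARY.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    for suffix, factor in _DECIMAL.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: the value is already in bytes


print(cpu_to_millicores("250m"))  # 250
print(memory_to_bytes("64Mi"))    # 67108864
```

With the values above, the container requests a quarter of a vCore and 64 MiB of memory, and is capped at half a vCore and 128 MiB.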

CPU usage examples

apiVersion: apps/v1
kind: Deployment
metadata:
  name: foo-api
  namespace: demo
spec:
  # replicas: 3  (left unset so that the HorizontalPodAutoscaler manages the replica count)
  selector:
    matchLabels:
      app: foo-api
  template:
    metadata:
      labels:
        app: foo-api
    spec:
      containers:
      - name: foo-api
        image: myacr.azurecr.io/foo:20210705
        ports:
        - name: http
          containerPort: 80
          protocol: TCP
        resources:
          requests:
            cpu: 250m   # a request must not exceed its limit
          limits:
            cpu: 500m

Autoscaler (examples are below)

Using yaml manifest file (i.e. foo-hpa.yaml)
Using kubectl command

foo-hpa.yaml

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: abc-hpa
spec:
  maxReplicas: 10 # define max replica count
  minReplicas: 3  # define min replica count
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: abc-deployment
  targetCPUUtilizationPercentage: 50 # target CPU utilization

---

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: xyz-hpa
spec:
  maxReplicas: 10 # define max replica count
  minReplicas: 3  # define min replica count
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: xyz-deployment
  targetCPUUtilizationPercentage: 50 # target CPU utilization

Kubectl command

kubectl autoscale deployment azure-vote-front --cpu-percent=50 --min=3 --max=10

Cluster Autoscaler

AKS can automatically scale the number of nodes up or down.
The cluster autoscaler component watches for pods in your cluster that can’t be scheduled because of resource constraints.
When such pods are detected, the number of nodes in a node pool is increased to meet the application demand.
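The scale-up idea can be illustrated with a deliberately simplified Python sketch: given the CPU requests of the unschedulable pods, estimate how many extra nodes would let them fit, without exceeding the node pool maximum. Everything here is an assumption made for illustration (the function name, the CPU-only first-fit packing, the parameters); the real cluster autoscaler simulates the scheduler and considers memory, taints, affinity, and more.

```python
def nodes_to_add(pending_pod_cpu_m, node_cpu_m, current_nodes, max_nodes):
    """Estimate extra nodes needed for pending pods (CPU-only, first-fit).

    pending_pod_cpu_m: CPU requests of unschedulable pods, in millicores.
    node_cpu_m: allocatable CPU of one new node, in millicores.
    """
    free_per_new_node = []  # remaining CPU on each hypothetical new node
    for request in sorted(pending_pod_cpu_m, reverse=True):
        if request > node_cpu_m:
            continue  # larger than a whole node; adding nodes would not help
        for i, free in enumerate(free_per_new_node):
            if free >= request:
                free_per_new_node[i] = free - request
                break
        else:
            # No hypothetical node has room, so open another one.
            free_per_new_node.append(node_cpu_m - request)
    headroom = max(0, max_nodes - current_nodes)
    return min(len(free_per_new_node), headroom)


# Three pending pods of 500m each, nodes offering 1000m, pool of 3 (max 10)
print(nodes_to_add([500, 500, 500], 1000, current_nodes=3, max_nodes=10))  # 2
```

The clamp against the pool maximum reflects that a node pool will not grow beyond its configured limit, so some pods may stay pending even after scaling.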
