Author : MD TAREQ HASSAN | Updated : 2023/07/19
Container orchestration
- Container orchestration refers to the management and coordination of containerized applications deployed in a distributed environment (a distributed system, e.g. Kubernetes running across multiple nodes)
- It involves automating various tasks related to deploying, scaling, networking, and managing containers
- Container orchestration involves:
- Deployment: orchestrators automate the deployment of containerized apps
- Load balancing: orchestrators balance incoming traffic across multiple containers to ensure even distribution
- Service discovery and networking: orchestration platforms provide mechanisms for service discovery, allowing containers to communicate with each other
- Health monitoring and self-healing: orchestrators monitor the health of containers and automatically restart or replace failed containers
- Configuration management: container orchestration platforms facilitate the management of application configurations
- Rolling updates and rollbacks: orchestrators enable seamless updates of containerized applications by gradually rolling out new versions
Kubernetes 101
- Kubernetes is an open-source container orchestration platform that provides a framework for automating the deployment, scaling, and management of containerized applications
- Kubernetes is a platform for managing containerized workloads and services, that facilitates both declarative configuration and automation
- Kubernetes abstracts away complex container management and provides us with declarative configuration to orchestrate containers in different compute environments
- Kubernetes (K8s) is a container orchestration tool:
- A container orchestrator is a system that automatically deploys and manages containerized workloads
- Orchestration is the automated configuration, management and coordination of applications, and services
As Kubernetes is a container orchestrator, it needs a container runtime in order to orchestrate; Docker is the most commonly used container runtime. Kubernetes automates many complex tasks related to container orchestration, ensuring high availability, fault tolerance, and efficient resource utilization.
Kubernetes cluster
- A cluster is a set of computers or Virtual Machines (VM) that you configure to work together and view as a single system
- A Kubernetes cluster is a set of nodes that run containerized workloads (each node in a cluster can be either a physical server or a virtual machine)
- Kubernetes clusters consist of one master node and a number of worker nodes
A cluster is a set of nodes that collectively provide the computing resources and infrastructure to run and manage containerized applications using the Kubernetes platform.
Node
- A node may be a virtual or physical machine, depending on the cluster
- Kubernetes runs containerized workloads by placing containers into Pods to run on Nodes
- Each node is managed by the control plane and contains the services necessary to run Pods
Typically you have several nodes in a cluster, but you might have only one node.
Master node
- A master node is a node that controls and manages a set of worker nodes (where the workloads run) within a Kubernetes cluster
- The master node is responsible for managing and controlling the entire cluster
- Master node:
- oversees the scheduling, deployment, and monitoring of applications running on the worker nodes
- hosts essential components like the Kubernetes API server, scheduler, and controller manager
- acts as the central control point for managing the cluster’s resources and orchestrating containerized applications
- The master node controls the state of the cluster; for example, which applications are running and their corresponding container images
Worker nodes
- The worker nodes are the components that run containerized applications
- Worker nodes, also known as minion nodes, are the servers where the actual application containers run
- Worker nodes:
- responsible for executing the workloads assigned to them by the master node
- typically have a container runtime, such as Docker, installed to run the containers
- communicate with the master node to receive instructions, report status, and handle resource allocation and scheduling
- Tasks are assigned to worker nodes
Control Plane
- Kubernetes control plane is a collection of services that manage the orchestration functionality in Kubernetes
- The container orchestration layer that exposes the API and interfaces to define, deploy, and manage the lifecycle of containers
- A set of components for managing the overall cluster
Kubernetes Cluster Components
Kubernetes components are the software modules that make up the Kubernetes control plane. These components run on the master node and are responsible for managing and controlling the cluster.
Control Plane Components:
- API Server
- Controller
- Controller manager
- Scheduler
- Etcd
Node Components:
- Kubelet
- Kube proxy
- Container runtime
The master node runs Control Plane Components, and the worker nodes run Node Components.
API server
- Exposes the Kubernetes API, which serves as the primary interface for interacting with the cluster
- Exposes a REST interface to all Kubernetes resources
- Serves as the front end of the Kubernetes control plane
- All the communication between the components in Kubernetes is done through this API
API server handles API requests, validates them, and performs operations such as deploying applications or modifying cluster state
Desired State
- The state of the cluster that is required to run the target workload properly
- A desired state is defined by configuration files made up of manifests, which are JSON or YAML files that declare the type of application to run and how many replicas are required to run a healthy system
- The cluster’s desired state is defined with the Kubernetes API
Example: thermostat
When you set the temperature, that’s telling the thermostat about your desired state.
The actual room temperature is the current state. The thermostat acts to bring the current state closer to the desired state,
by turning equipment on or off.
Desired State in K8s
In Kubernetes, controllers are control loops that watch the state of your cluster,
then make or request changes where needed. Each controller tries to move the current cluster state closer to the desired state.
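For example, the manifest below is a minimal sketch (the name hello-deployment and the nginx image are illustrative) that declares a desired state of three replicas; Kubernetes controllers keep creating or replacing Pods until the observed state matches this declaration.

```yaml
# Desired state: always keep 3 replicas of a hypothetical "hello" app running
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-deployment
spec:
  replicas: 3                # the desired state: 3 Pods at all times
  selector:
    matchLabels:
      app: hello
  template:                  # Pod template used to create the replicas
    metadata:
      labels:
        app: hello
    spec:
      containers:
        - name: hello
          image: nginx:1.25  # example image only
```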
Controller
- In Kubernetes, a controller is a control loop that watches the shared state of the cluster through the apiserver and makes changes attempting to move the current state towards the desired state
- A controller is a key component of Kubernetes responsible for managing and maintaining the desired state of various Kubernetes resources
- Controllers continuously monitor the state of the cluster and take action to reconcile any differences between the current state and the desired state specified by the user
- Controllers automate tasks such as resource creation, scaling, healing, and termination, allowing users to focus on defining the desired state rather than manually managing the cluster
There are several types of controllers in Kubernetes, each designed to manage specific resources. Here are some commonly used controllers:
- Replication Controller: ensures that a specified number of pod replicas are running at all times
- Deployment Controller: provides a declarative way to manage rolling updates and rollbacks of application deployments
- StatefulSet Controller: used for managing stateful applications. It ensures ordered and predictable deployment & scaling of stateful pods
- DaemonSet Controller: ensures that a specific pod runs on each node in the cluster
- Job Controller: manages batch workloads or one-off tasks
- CronJob Controller: allows scheduling and running jobs at specified intervals using cron-like expressions
Controller manager
- Kube Controller Manager:
- Runs controller processes and reconciles the cluster’s actual state with its desired specifications
- Manages controllers such as node controllers, endpoints controllers and replication controllers
- Cloud Controller Manager: A Kubernetes control plane component that embeds cloud-specific control logic
Scheduler
- The scheduler assigns pods to nodes based on resource requirements, constraints, and availability
- It ensures efficient utilization of resources and optimal distribution of workloads across the cluster
- Watches for newly created Pods with no assigned node, and selects a node for them to run on
- Places Pods according to their resource requirements, constraints, and current node metrics
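As a sketch of what the scheduler looks at (the Pod name, image and the disktype=ssd label are illustrative assumptions), a Pod spec can declare resource requests and node constraints; the scheduler only places the Pod on a node that satisfies them.

```yaml
# Illustrative Pod spec: the scheduler picks a node that has enough free
# CPU/memory for the requests and matches the nodeSelector constraint
apiVersion: v1
kind: Pod
metadata:
  name: scheduling-demo
spec:
  nodeSelector:
    disktype: ssd            # constraint: only nodes labeled disktype=ssd
  containers:
    - name: app
      image: nginx:1.25      # example image only
      resources:
        requests:
          cpu: "250m"        # scheduler only considers nodes with this much free CPU
          memory: "256Mi"
        limits:
          cpu: "500m"
          memory: "512Mi"
```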
etcd
- etcd is a distributed key-value store used by Kubernetes to store cluster configuration data and state information
- It provides consistent and reliable storage for Kubernetes’ internal operations
- Consistent and highly available store that is used as Kubernetes’ backing store for all cluster data
Kubelet
- The kubelet is the agent that runs on each node in the cluster, and monitors work requests from the API server
- Ensures that containers are running in a Pod by interacting with the container runtime (e.g. the Docker engine or containerd)
- The node agent responsible for creating and managing containers on its node
- Takes a set of provided PodSpecs and ensures that their corresponding containers are fully operational
Kube proxy
- Manages network connectivity and maintains network rules on each node
- kube-proxy maintains network rules on nodes. These network rules allow network communication to your Pods from network sessions inside or outside of your cluster
- Implements the Kubernetes Service concept across every node in a given cluster
Container Runtime
- The container runtime is the software that is responsible for running containers
- The container runtime (e.g., containerd, CRI-O) runs and manages containers on each worker node
- Kubernetes interacts with the container runtime to deploy and manage containers
Kubernetes Resources
Kubernetes resources (“Kubernetes objects”) are the declarative objects that you define and manage within the cluster to represent the desired state of your applications and their associated components. Kubernetes resources are defined using manifest files (YAML or JSON) and are submitted to the Kubernetes API, which acts on them to reach the desired state.
Some of the key resources (Kubernetes objects):
- Pod
- ReplicaSet
- Deployment
- Service
- StatefulSet
- ConfigMap
- Secret
- PersistentVolume (PV) and PersistentVolumeClaim (PVC)
- Namespace
Pod
Unlike in a Docker environment, you can’t run containers directly on Kubernetes. You package the container into a Kubernetes object called a pod. Each Pod is its own self-contained server. When a Pod runs multiple containers, the containers are managed as a single entity and share the Pod’s resources. A single pod can hold a group of one or more containers. However, a pod typically doesn’t contain multiples of the same app.
A Kubernetes Pod:
- the smallest, fundamental unit of deployment in Kubernetes (most basic deployable objects in Kubernetes)
- represents one or more co-located containers (a group of one or more tightly coupled containers that share the same network namespace, storage, and scheduling specifications)
- an abstraction over containers in the context of Kubernetes
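A minimal Pod manifest might look like the following (the name, labels and image are illustrative):

```yaml
# Illustrative single-container Pod
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
  labels:
    app: myapp
spec:
  containers:
    - name: myapp
      image: nginx:1.25      # example container image
      ports:
        - containerPort: 80  # port the container listens on
```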
ReplicaSet
- A ReplicaSet is a resource used to ensure that a specified number of identical pod replicas are running and maintained at all times
- It helps ensure high availability, fault tolerance, and scalability for applications within the Kubernetes cluster (ReplicaSet is used primarily for scaling and ensuring high availability)
- ReplicaSets are typically managed by higher-level resources like Deployments
A ReplicaSet’s purpose is to maintain a stable set of replica Pods running at any given time.
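An illustrative ReplicaSet manifest (names and image are placeholders); it keeps three identical Pods matching the selector running at all times:

```yaml
# Illustrative ReplicaSet: maintains 3 identical Pods
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: myapp-rs
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:                  # Pod template used to create the replicas
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: nginx:1.25  # example image only
```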
Deployment
- Provides declarative management for Pods and ReplicaSets
- Under the hood, Deployments create and manage ReplicaSets
- Deployments own and manage their ReplicaSets
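A sketch of a Deployment manifest (names, image and the rolling-update values are illustrative); under the hood it creates a ReplicaSet for the Pod template and rolls out changes gradually:

```yaml
# Illustrative Deployment: manages a ReplicaSet and rolls out updates gradually
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deployment
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1        # at most 1 Pod down during an update
      maxSurge: 1              # at most 1 extra Pod during an update
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: nginx:1.25    # changing this image triggers a rolling update
```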
Service
- Also called “Service Object”
- A Service sits in front of a set of Pods (selected by labels) and exposes them as a single, stable network endpoint
- A Service has a stable IP address; Pods and clients communicate with the backing Pods through that Service IP rather than through individual Pod IPs
- See details: Kubernetes Service Component
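An illustrative ClusterIP Service manifest (name and ports are assumptions); it selects Pods by label and gives them one stable address:

```yaml
# Illustrative ClusterIP Service: routes traffic to Pods labeled app=myapp
apiVersion: v1
kind: Service
metadata:
  name: myapp-service
spec:
  selector:
    app: myapp               # selects the backing Pods by label
  ports:
    - protocol: TCP
      port: 80               # stable port exposed by the Service
      targetPort: 80         # container port on the Pods
```

Inside the cluster, the Service is typically also reachable by its DNS name (e.g. myapp-service.<namespace>.svc.cluster.local).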
StatefulSet
- A StatefulSet is a higher-level resource designed for managing stateful applications (e.g. MongoDB or Kafka would be deployed to Kubernetes as a StatefulSet rather than a ReplicaSet/Deployment)
- StatefulSet ensures ordered and predictable deployment, scaling, and termination of stateful application components
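A rough StatefulSet sketch for a hypothetical MongoDB deployment (names, image and storage size are illustrative, and a headless Service named mongo is assumed to exist); each replica gets a stable identity (mongo-0, mongo-1, ...) and its own storage:

```yaml
# Illustrative StatefulSet: stable Pod identity plus per-replica storage
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: mongo          # headless Service that gives each Pod a stable DNS name
  replicas: 3
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
        - name: mongo
          image: mongo:6.0    # example image only
  volumeClaimTemplates:       # each replica gets its own PersistentVolumeClaim
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```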
ConfigMap
- A ConfigMap stores configuration data as key-value pairs and can be used to store environment variables, command-line arguments, or configuration files
- ConfigMaps decouple configuration from application code, making it easier to manage and update configurations independently
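An illustrative ConfigMap (keys and values are made up); entries can be plain key-value pairs or whole configuration files:

```yaml
# Illustrative ConfigMap holding configuration as key-value pairs
apiVersion: v1
kind: ConfigMap
metadata:
  name: myapp-config
data:
  APP_MODE: "production"
  app.properties: |           # an entry can also hold an entire config file
    log.level=info
    cache.size=128
```

A Pod can then consume these entries as environment variables or as files mounted into the container, without changing the application code.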
Secret
- A Secret is used for securely storing sensitive information, such as passwords, API keys, or TLS certificates
- Secrets are base64-encoded by default and can additionally be encrypted at rest before being stored in etcd
- They provide a way to securely inject sensitive data into pods
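An illustrative Secret manifest (the key and value are placeholders); using stringData lets you write plain values, which the API server stores base64-encoded:

```yaml
# Illustrative Secret; never commit real credentials to source control
apiVersion: v1
kind: Secret
metadata:
  name: myapp-secret
type: Opaque
stringData:                   # plain values here; stored base64-encoded under "data"
  DB_PASSWORD: "changeit"     # example value only
```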
PersistentVolume & PersistentVolumeClaim
- PersistentVolume (PV) represents physical storage resources, such as disks or network-attached storage, while PersistentVolumeClaim (PVC) is a request made by pods for specific storage requirements.
- PVs and PVCs are used to provide persistent storage for applications.
- PVCs bind to available PVs, providing applications with persistent storage even if pods are restarted or rescheduled
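An illustrative PersistentVolumeClaim (name, size and access mode are assumptions); a Pod references this claim in its volumes section to use the bound storage:

```yaml
# Illustrative PVC: a request for 5Gi of persistent storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myapp-pvc
spec:
  accessModes:
    - ReadWriteOnce           # mountable read-write by a single node
  resources:
    requests:
      storage: 5Gi
  # storageClassName: standard   # optional; depends on the cluster's storage classes
```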
Namespace
- Kubernetes supports multiple virtual clusters backed by the same physical cluster. These virtual clusters are called namespaces
- A K8s namespace is a logical cluster within a physical cluster
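For example, a namespace is itself created from a small manifest (the name team-a is illustrative); other resources then set metadata.namespace to place themselves inside it:

```yaml
# Illustrative Namespace
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
```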
Endpoint
- An endpoint in Kubernetes is an object on the Kubernetes API server that records the network location of a resource (e.g. a Pod); the ‘endpoint’ is the way to access the resource behind it
- In other words, an endpoint is a socket (IP + port) used by a Service to access/communicate with a Pod
- Pods expose themselves through endpoints to a service
- See: https://stackoverflow.com/q/52857825/4802664
Dashboard
- Dashboard is a general purpose, web-based UI for Kubernetes clusters
- It allows users to manage and troubleshoot applications running in the cluster, as well as the cluster itself
kubectl
- Kubernetes provides a command-line tool called kubectl to manage your cluster
- You use kubectl to send commands to the cluster’s control plane, or fetch information about all Kubernetes objects via the API server
Manifest file
- A declarative file that is used to create Kubernetes resources
- Manifest file:
- Blueprint for creating Kubernetes resources (e.g. Pods, Deployments, Services)
- Declarative
- Used to achieve the desired state
- Can be YAML or JSON