Author : MD TAREQ HASSAN | Updated : 2023/07/19
Container orchestration
- Container orchestration refers to the management and coordination of containerized applications deployed in a distributed environment (a distributed system, e.g. Kubernetes running across multiple nodes)
- It involves automating various tasks related to deploying, scaling, networking, and managing containers
- Container orchestration involves:
- Deployment: orchestrators automate the deployment of containerized apps
- Load balancing: orchestrators balance incoming traffic across multiple containers to ensure even distribution
- Service discovery and networking: orchestration platforms provide mechanisms for service discovery, allowing containers to communicate with each other
- Health monitoring and self-healing: orchestrators monitor the health of containers and automatically restart or replace failed containers
- Configuration management: container orchestration platforms facilitate the management of application configurations
- Rolling updates and rollbacks: orchestrators enable seamless updates of containerized applications by gradually rolling out new versions
Kubernetes 101
- Kubernetes is an open-source container orchestration platform that provides a framework for automating the deployment, scaling, and management of containerized applications
- Kubernetes is a platform for managing containerized workloads and services, that facilitates both declarative configuration and automation
- Kubernetes abstracts away complex container management and provides us with declarative configuration to orchestrate containers in different compute environments
- Kubernetes (K8s) is a container orchestration tool:
- A container orchestrator is a system that automatically deploys and manages containerized workloads
- Orchestration is the automated configuration, management and coordination of applications, and services
As Kubernetes is a container orchestrator, it needs a container runtime in order to orchestrate; Docker is the most commonly used container runtime. Kubernetes automates many complex tasks related to container orchestration, ensuring high availability, fault tolerance, and efficient resource utilization.
Kubernetes cluster
- A cluster is a set of computers or Virtual Machines (VM) that you configure to work together and view as a single system
- A Kubernetes cluster is a set of nodes that run containerized workloads (each node in a cluster can be either a physical server or a virtual machine)
- Kubernetes clusters consist of one master node and a number of worker nodes
A cluster is a set of nodes that collectively provide the computing resources and infrastructure to run and manage containerized applications using the Kubernetes platform.
Node
- A node may be a virtual or physical machine, depending on the cluster
- Kubernetes runs containerized workloads by placing containers into Pods to run on Nodes
- Each node is managed by the control plane and contains the services necessary to run Pods
Typically you have several nodes in a cluster, but you might have only one node.
Master node
- A master node is a node that controls and manages a set of worker nodes (where the workloads run) within a Kubernetes cluster
- The master node is responsible for managing and controlling the entire cluster
- Master node:
- oversees the scheduling, deployment, and monitoring of applications running on the worker nodes
- hosts essential components like the Kubernetes API server, scheduler, and controller manager
- acts as the central control point for managing the cluster’s resources and orchestrating containerized applications
- The master node controls the state of the cluster; for example, which applications are running and their corresponding container images
Worker nodes
- The worker nodes are the components that run containerized applications
- Worker nodes, also known as minion nodes, are the servers where the actual application containers run
- Worker nodes:
- responsible for executing the workloads assigned to them by the master node
- typically have a container runtime, such as Docker, installed to run the containers
- communicate with the master node to receive instructions, report status, and handle resource allocation and scheduling
- Tasks are assigned to worker nodes
Control Plane
- Kubernetes control plane is a collection of services that manage the orchestration functionality in Kubernetes
- The container orchestration layer that exposes the API and interfaces to define, deploy, and manage the lifecycle of containers
- A set of components for managing the overall cluster
Kubernetes Cluster Components
Kubernetes components are the software modules that make up the Kubernetes control plane. These components run on the master node and are responsible for managing and controlling the cluster.
Control Plane Components:
- API Server
- Controller
- Controller manager
- Scheduler
- Etcd
Node Components:
- Kubelet
- Kube proxy
- Container runtime
The master node runs Control Plane Components, and the worker nodes run Node Components.
API server
- Exposes the Kubernetes API, which serves as the primary interface for interacting with the cluster
- Exposes a REST interface to all Kubernetes resources
- Serves as the front end of the Kubernetes control plane
- All the communication between the components in Kubernetes is done through this API
API server handles API requests, validates them, and performs operations such as deploying applications or modifying cluster state
Desired State
- The state of the cluster that is required to run the target workload properly
- A desired state is defined by configuration files made up of manifests, which are JSON or YAML files that declare the type of application to run and how many replicas are required to run a healthy system
- The cluster’s desired state is defined with the Kubernetes API
Example: thermostat
When you set the temperature, that’s telling the thermostat about your desired state.
The actual room temperature is the current state. The thermostat acts to bring the current state closer to the desired state,
by turning equipment on or off.
Desired State in K8s
In Kubernetes, controllers are control loops that watch the state of your cluster,
then make or request changes where needed. Each controller tries to move the current cluster state closer to the desired state.
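For example, the manifest below is a minimal sketch (the name hello-deployment and the nginx image are illustrative) that declares a desired state of three replicas; Kubernetes controllers keep creating or replacing Pods until the observed state matches this declaration.

```yaml
# Desired state: always keep 3 replicas of a hypothetical "hello" app running
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-deployment
spec:
  replicas: 3                # the desired state: 3 Pods at all times
  selector:
    matchLabels:
      app: hello
  template:                  # Pod template used to create the replicas
    metadata:
      labels:
        app: hello
    spec:
      containers:
        - name: hello
          image: nginx:1.25  # example image only
```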
Controller
- In Kubernetes, a controller is a control loop that watches the shared state of the cluster through the apiserver and makes changes attempting to move the current state towards the desired state
- A controller is a key component of Kubernetes responsible for managing and maintaining the desired state of various Kubernetes resources
- Controllers continuously monitor the state of the cluster and take action to reconcile any differences between the current state and the desired state specified by the user
- Controllers automate tasks such as resource creation, scaling, healing, and termination, allowing users to focus on defining the desired state rather than manually managing the cluster
There are several types of controllers in Kubernetes, each designed to manage specific resources. Here are some commonly used controllers:
- Replication Controller: ensures that a specified number of pod replicas are running at all times
- Deployment Controller: provides a declarative way to manage rolling updates and rollbacks of application deployments
- StatefulSet Controller: used for managing stateful applications. It ensures ordered and predictable deployment & scaling of stateful pods
- DaemonSet Controller: ensures that a specific pod runs on each node in the cluster
- Job Controller: manages batch workloads or one-off tasks
- CronJob Controller: allows scheduling and running jobs at specified intervals using cron-like expressions
Controller manager
- Kube Controller Manager:
- Runs controller processes and reconciles the cluster’s actual state with its desired specifications
- Manages controllers such as node controllers, endpoints controllers and replication controllers
- Cloud Controller Manager: A Kubernetes control plane component that embeds cloud-specific control logic
Scheduler
- The scheduler assigns pods to nodes based on resource requirements, constraints, and availability
- It ensures efficient utilization of resources and optimal distribution of workloads across the cluster
- Watches for newly created Pods with no assigned node, and selects a node for them to run on
- Places Pods according to their resource requirements, constraints, and current node metrics
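As a sketch of what the scheduler looks at (the Pod name, image and the disktype=ssd label are illustrative assumptions), a Pod spec can declare resource requests and node constraints; the scheduler only places the Pod on a node that satisfies them.

```yaml
# Illustrative Pod spec: the scheduler picks a node that has enough free
# CPU/memory for the requests and matches the nodeSelector constraint
apiVersion: v1
kind: Pod
metadata:
  name: scheduling-demo
spec:
  nodeSelector:
    disktype: ssd            # constraint: only nodes labeled disktype=ssd
  containers:
    - name: app
      image: nginx:1.25      # example image only
      resources:
        requests:
          cpu: "250m"        # scheduler only considers nodes with this much free CPU
          memory: "256Mi"
        limits:
          cpu: "500m"
          memory: "512Mi"
```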
etcd
- etcd is a distributed key-value store used by Kubernetes to store cluster configuration data and state information
- It provides consistent and reliable storage for Kubernetes’ internal operations
- Consistent and highly available store that is used as Kubernetes’ backing store for all cluster data
Kubelet
- The kubelet is the agent that runs on each node in the cluster, and monitors work requests from the API server
- Ensures that containers are running in a Pod by interacting with the container runtime (e.g. the Docker engine or containerd)
- The node agent responsible for creating and managing containers on its node
- Takes a set of provided PodSpecs and ensures that their corresponding containers are fully operational
Kube proxy
- Manages network connectivity and maintains network rules on each node
- kube-proxy maintains network rules on nodes. These network rules allow network communication to your Pods from network sessions inside or outside of your cluster
- Implements the Kubernetes Service concept across every node in a given cluster
Container Runtime
- The container runtime is the software that is responsible for running containers
- The container runtime (e.g., containerd, CRI-O) runs and manages containers on each worker node
- Kubernetes interacts with the container runtime to deploy and manage containers
Kubernetes Resources
Kubernetes resources (“Kubernetes objects”) are the declarative objects that you define and manage within the cluster to represent the desired state of your applications and their associated components. Kubernetes resources are defined using manifest files (YAML or JSON) and are submitted to the Kubernetes API, which acts on them to reach the desired state.
Some of the key resources (Kubernetes objects):
- Pod
- ReplicaSet
- Deployment
- Service
- StatefulSet
- ConfigMap
- Secret
- PersistentVolume (PV) and PersistentVolumeClaim (PVC)
- Namespace
Pod
Unlike in a Docker environment, you can’t run containers directly on Kubernetes. You package the container into a Kubernetes object called a pod. Each Pod is its own self-contained server. When a Pod runs multiple containers, the containers are managed as a single entity and share the Pod’s resources. A single pod can hold a group of one or more containers. However, a pod typically doesn’t contain multiples of the same app.
A Kubernetes Pod:
- the smallest, fundamental unit of deployment in Kubernetes (most basic deployable objects in Kubernetes)
- represents one or more co-located containers (a group of one or more tightly coupled containers that share the same network namespace, storage, and scheduling specifications)
- an abstraction over containers in the context of Kubernetes
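A minimal Pod manifest might look like the following (the name, labels and image are illustrative):

```yaml
# Illustrative single-container Pod
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
  labels:
    app: myapp
spec:
  containers:
    - name: myapp
      image: nginx:1.25      # example container image
      ports:
        - containerPort: 80  # port the container listens on
```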
ReplicaSet
- A ReplicaSet is a resource used to ensure that a specified number of identical pod replicas are running and maintained at all times
- It helps ensure high availability, fault tolerance, and scalability for applications within the Kubernetes cluster (ReplicaSet is used primarily for scaling and ensuring high availability)
- ReplicaSets are typically managed by higher-level resources like Deployments
A ReplicaSet’s purpose is to maintain a stable set of replica Pods running at any given time.
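An illustrative ReplicaSet manifest (names and image are placeholders); it keeps three identical Pods matching the selector running at all times:

```yaml
# Illustrative ReplicaSet: maintains 3 identical Pods
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: myapp-rs
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:                  # Pod template used to create the replicas
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: nginx:1.25  # example image only
```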
Deployment
- Provides declarative management for Pods and ReplicaSets
- Under the hood, Deployments create and manage ReplicaSets
- Deployments own and manage their ReplicaSets
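A sketch of a Deployment manifest (names, image and the rolling-update values are illustrative); under the hood it creates a ReplicaSet for the Pod template and rolls out changes gradually:

```yaml
# Illustrative Deployment: manages a ReplicaSet and rolls out updates gradually
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deployment
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1        # at most 1 Pod down during an update
      maxSurge: 1              # at most 1 extra Pod during an update
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: nginx:1.25    # changing this image triggers a rolling update
```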
Service
- Also called “Service Object”
- A Service sits in front of a set of Pods (selected by labels) and exposes them as a single, stable network endpoint
- A Service has a stable IP address; Pods and clients communicate with the backing Pods through that Service IP rather than through individual Pod IPs
- See details: Kubernetes Service Component
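An illustrative ClusterIP Service manifest (name and ports are assumptions); it selects Pods by label and gives them one stable address:

```yaml
# Illustrative ClusterIP Service: routes traffic to Pods labeled app=myapp
apiVersion: v1
kind: Service
metadata:
  name: myapp-service
spec:
  selector:
    app: myapp               # selects the backing Pods by label
  ports:
    - protocol: TCP
      port: 80               # stable port exposed by the Service
      targetPort: 80         # container port on the Pods
```

Inside the cluster, the Service is typically also reachable by its DNS name (e.g. myapp-service.<namespace>.svc.cluster.local).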
StatefulSet
- A StatefulSet is a higher-level resource designed for managing stateful applications (e.g. MongoDB or Kafka would be deployed to Kubernetes as a StatefulSet rather than a ReplicaSet/Deployment)
- StatefulSet ensures ordered and predictable deployment, scaling, and termination of stateful application components
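A rough StatefulSet sketch for a hypothetical MongoDB deployment (names, image and storage size are illustrative, and a headless Service named mongo is assumed to exist); each replica gets a stable identity (mongo-0, mongo-1, ...) and its own storage:

```yaml
# Illustrative StatefulSet: stable Pod identity plus per-replica storage
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: mongo          # headless Service that gives each Pod a stable DNS name
  replicas: 3
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
        - name: mongo
          image: mongo:6.0    # example image only
  volumeClaimTemplates:       # each replica gets its own PersistentVolumeClaim
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```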
ConfigMap
- A ConfigMap stores configuration data as key-value pairs and can be used to store environment variables, command-line arguments, or configuration files
- ConfigMaps decouple configuration from application code, making it easier to manage and update configurations independently
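An illustrative ConfigMap (keys and values are made up); entries can be plain key-value pairs or whole configuration files:

```yaml
# Illustrative ConfigMap holding configuration as key-value pairs
apiVersion: v1
kind: ConfigMap
metadata:
  name: myapp-config
data:
  APP_MODE: "production"
  app.properties: |           # an entry can also hold an entire config file
    log.level=info
    cache.size=128
```

A Pod can then consume these entries as environment variables or as files mounted into the container, without changing the application code.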
Secret
- A Secret is used for securely storing sensitive information, such as passwords, API keys, or TLS certificates
- Secrets are base64-encoded by default and can additionally be encrypted at rest before being stored in etcd
- They provide a way to securely inject sensitive data into pods
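An illustrative Secret manifest (the key and value are placeholders); using stringData lets you write plain values, which the API server stores base64-encoded:

```yaml
# Illustrative Secret; never commit real credentials to source control
apiVersion: v1
kind: Secret
metadata:
  name: myapp-secret
type: Opaque
stringData:                   # plain values here; stored base64-encoded under "data"
  DB_PASSWORD: "changeit"     # example value only
```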
PersistentVolume & PersistentVolumeClaim
- PersistentVolume (PV) represents physical storage resources, such as disks or network-attached storage, while PersistentVolumeClaim (PVC) is a request made by pods for specific storage requirements.
- PVs and PVCs are used to provide persistent storage for applications.
- PVCs bind to available PVs, providing applications with persistent storage even if pods are restarted or rescheduled
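An illustrative PersistentVolumeClaim (name, size and access mode are assumptions); a Pod references this claim in its volumes section to use the bound storage:

```yaml
# Illustrative PVC: a request for 5Gi of persistent storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myapp-pvc
spec:
  accessModes:
    - ReadWriteOnce           # mountable read-write by a single node
  resources:
    requests:
      storage: 5Gi
  # storageClassName: standard   # optional; depends on the cluster's storage classes
```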
Namespace
- Kubernetes supports multiple virtual clusters backed by the same physical cluster. These virtual clusters are called namespaces
- A K8s namespace is a logical cluster within a physical cluster
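For example, a namespace is itself created from a small manifest (the name team-a is illustrative); other resources then set metadata.namespace to place themselves inside it:

```yaml
# Illustrative Namespace
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
```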
Endpoint
- An endpoint in Kubernetes is an object on the Kubernetes API server that records the network location of a resource (e.g. a Pod); the ‘endpoint’ is the way to access the resource behind it
- In other words, an endpoint is a socket (IP + port) used by a Service to access/communicate with a Pod
- Pods expose themselves through endpoints to a service
- See: https://stackoverflow.com/q/52857825/4802664
Dashboard
- Dashboard is a general purpose, web-based UI for Kubernetes clusters
- It allows users to manage and troubleshoot applications running in the cluster, as well as the cluster itself
kubectl
- Kubernetes provides a command-line tool called kubectl to manage your cluster
- You use kubectl to send commands to the cluster’s control plane, or fetch information about all Kubernetes objects via the API server
Manifest file
- A declarative file that is used to create Kubernetes resources
- Manifest file:
- Blueprint for creating Kubernetes resources (e.g. Pods, Deployments, Services)
- Declarative
- Used to achieve the desired state
- Can be YAML or JSON