Kubernetes 101: Part 1 - Kubernetes Architecture

Kubernetes is an open-source platform that simplifies the deployment, scaling, and management of containerized applications. Think of it as a skilled conductor leading a symphony of containers, making sure they all perform in perfect harmony.

Kubernetes Cluster Overview

At the most basic level, a Kubernetes cluster is made up of two kinds of nodes: the control plane (historically called the master node) and one or more worker nodes.

As the name suggests, the control plane controls and manages the entire cluster. To do so, it relies on the following components:

Kube-api-server: The kube-api-server is the front end of the control plane. It validates and processes the requests made to the Kubernetes cluster. Every request, whether it comes from inside or outside the cluster, goes through the api-server before being carried out. Think of a hotel manager who needs to be informed of everything going on within the hotel and who also attends to new guests (in this case, requests made with kubectl).
Controller-manager: The controller manager is a crucial part of Kubernetes that keeps the cluster's actual state in line with its desired state. It operates control loops, continuously checking for changes and adjusting via the API server.

Its main duties are:
- Ensuring cluster resources remain in their desired state
- Running various controllers for tasks like replication, managing endpoints, and handling namespaces
- Communicating with the API server to oversee and update the cluster's state

Continuing the hotel example, the controller manager resembles the head chef, who is responsible for the guests' dining needs. The head chef compares the current state of each order with its desired state, while the other chefs and workers are the individual controllers tasked with preparing, garnishing, cleaning, and so on.
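
To make the idea of a control loop more concrete, here is a minimal sketch in Python. It is not real Kubernetes code: the function name reconcile, the pod naming scheme, and the action strings are all made up for illustration. It only shows the core pattern of comparing the desired state with the actual state and working out what to change.

```python
# Toy reconciliation step: NOT real Kubernetes code, only an illustration.
# The function name, pod names, and action strings are hypothetical.
def reconcile(desired_replicas: int, running_pods: list[str]) -> list[str]:
    """Return the actions needed to move the actual state to the desired state."""
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Too few pods are running: request the missing ones.
        return [f"create pod-{len(running_pods) + i}" for i in range(diff)]
    if diff < 0:
        # Too many pods are running: remove the surplus from the end of the list.
        return [f"delete {name}" for name in running_pods[diff:]]
    # Actual state already matches the desired state: nothing to do.
    return []


if __name__ == "__main__":
    print(reconcile(3, ["pod-0"]))                    # two pods are missing
    print(reconcile(1, ["pod-0", "pod-1", "pod-2"]))  # two pods too many
```

A real controller runs a step like this over and over, reading the current state from the API server and sending its changes back through it.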

Kube-scheduler: The scheduler is the control plane component that assigns Pods to Nodes. It decides where each pod should run by considering factors such as resource availability, node affinity, and workload constraints. The kube-scheduler can be thought of as a supervisor who knows how many rooms in the hotel are empty and which one is best suited for a particular guest.
Etcd: Etcd is a distributed key-value store that acts as the brain of Kubernetes. It stores all the cluster data, including desired and actual states, configuration, and runtime information. Essentially, it serves as the point of reference for the entire cluster. Lastly, etcd can be considered the accountant of the hotel, who keeps track of every material and keeps records of every guest and staff member.
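
The scheduler's job of picking a node can be sketched in a few lines of Python. This is only a rough illustration under simplifying assumptions: the Node class, its fields, the node names, and the scoring rule (prefer the node with the most free CPU) are all invented here, and the real kube-scheduler takes many more factors into account.

```python
# Toy scheduler: first filter out nodes that cannot fit the pod, then score the rest.
# The Node class, its fields, and the scoring rule are hypothetical.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    free_cpu_millicores: int
    free_memory_mib: int


def pick_node(nodes: list[Node], cpu_request: int, memory_request: int) -> Optional[str]:
    # Filtering: keep only the nodes with enough free CPU and memory.
    feasible = [
        n for n in nodes
        if n.free_cpu_millicores >= cpu_request and n.free_memory_mib >= memory_request
    ]
    if not feasible:
        return None  # no node fits, so the pod stays unscheduled
    # Scoring: prefer the most free CPU, using free memory as a tie-breaker.
    best = max(feasible, key=lambda n: (n.free_cpu_millicores, n.free_memory_mib))
    return best.name


if __name__ == "__main__":
    cluster = [
        Node("node-a", 500, 1024),
        Node("node-b", 2000, 4096),
        Node("node-c", 4000, 256),
    ]
    print(pick_node(cluster, cpu_request=1000, memory_request=512))
```

In the hotel analogy, filtering is ruling out rooms that are too small for the guest, and scoring is choosing the best room among those that remain.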

A worker node is responsible for hosting the pods that run the containers. It has 2 main components, namely:

Kubelet: An agent that runs on each worker node and manages the lifecycle of the containers within Pods. It communicates with the Kubernetes control plane to receive instructions and report node status, ensuring that containers are running as specified in their Pod definitions. Following the hotel analogy, the kubelet could be the room service that is responsible for taking care of the guests (pods).
Kube-proxy: A network proxy that runs on each worker node and implements part of the Kubernetes Service concept by maintaining network rules on the node. These rules allow network traffic from inside or outside the cluster to reach Pods, and they provide load balancing across the multiple Pods behind a Service.
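
The load balancing that a Service provides can be pictured with a small Python sketch. Keep in mind that this is only a mental model: the class ToyService and the IP addresses are made up, and the real kube-proxy does not forward traffic in Python but sets up network rules on the node (for example with iptables or IPVS), where the way a backend is chosen depends on the mode. Round-robin is used here simply because it is the easiest strategy to follow.

```python
# Toy model of a Service spreading requests across its Pods.
# ToyService and the pod IPs are hypothetical; round-robin is an assumption
# chosen for simplicity, not a description of how kube-proxy always behaves.
from itertools import cycle


class ToyService:
    def __init__(self, name: str, pod_ips: list[str]):
        self.name = name
        # cycle() walks through the backends endlessly, one after another.
        self._backends = cycle(pod_ips)

    def route(self) -> str:
        """Return the pod IP that should handle the next request."""
        return next(self._backends)


if __name__ == "__main__":
    web = ToyService("web", ["10.0.0.1", "10.0.0.2", "10.0.0.3"])
    for _ in range(4):
        print(web.route())
```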

The Kubernetes Workflow

So, now let's see what happens when we execute "kubectl create -f <pod-definition-file>".

Firstly, the request is authenticated and validated.
The kube-api-server writes a record in the etcd database stating that a pod needs to be created, without specifying the node, as it has not been decided yet.
The kube-scheduler continuously watches the kube-api-server and notices that a pod has no node assigned. Taking all the relevant parameters into account, the scheduler selects the node most suitable for that particular pod.
The kube-scheduler then informs the kube-api-server of the selected node. The kube-api-server updates the pod's record in the etcd database with this placement.
Next, the kubelet on the selected node learns from the api-server that a pod has been assigned to it and starts the pod's containers. Once that is done, the kubelet reports the status of the pod back to the api-server, which then updates the record in the etcd database.
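
The steps above can be walked through with a small in-memory simulation in Python. Everything in it is a stand-in: the class ToyCluster, its method names, and the "first node wins" placement are assumptions made for illustration, and authentication is skipped entirely. Only the status values Pending and Running correspond to real pod phases.

```python
# Toy walk-through of the pod creation workflow. NOT real Kubernetes code.
# ToyCluster and its methods are hypothetical stand-ins for the real components.
class ToyCluster:
    def __init__(self, nodes: list[str]):
        self.nodes = list(nodes)
        self.store: dict[str, dict] = {}  # plays the role of etcd

    def create_pod(self, name: str) -> None:
        # api-server: validate the request, then record the pod with no node yet.
        if not name:
            raise ValueError("pod name is required")
        self.store[name] = {"node": None, "phase": "Pending"}

    def scheduler_pass(self) -> None:
        # scheduler: find pods without a node and record a placement for each.
        for pod in self.store.values():
            if pod["node"] is None:
                pod["node"] = self.nodes[0]  # a real scheduler filters and scores nodes

    def kubelet_pass(self, node: str) -> None:
        # kubelet on `node`: start the pods assigned to it and report their status.
        for pod in self.store.values():
            if pod["node"] == node and pod["phase"] == "Pending":
                pod["phase"] = "Running"


if __name__ == "__main__":
    cluster = ToyCluster(["node-a"])
    cluster.create_pod("web")
    print(cluster.store)  # pod recorded, no node assigned yet
    cluster.scheduler_pass()
    print(cluster.store)  # node assigned, pod still pending
    cluster.kubelet_pass("node-a")
    print(cluster.store)  # pod reported as running
```

Note how every component reads from and writes to the same store and never talks to another component directly. In a real cluster that role is played by the api-server in front of etcd.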