Introduction

Imagine you’re a Platform Engineer managing a production Azure Kubernetes Service (AKS) cluster.

A new application release has just been deployed.

The DevOps pipeline completes successfully and Kubernetes creates the Pods.

However, a few minutes later, alerts start firing.

Customers report that the application is unavailable.

You quickly run:

kubectl get pods

And see:

NAME                                READY   STATUS              RESTARTS
customer-portal-7c7f8c9f8d-jd2m8   0/1     ContainerCreating   0
customer-portal-7c7f8c9f8d-xn4lp   0/1     ContainerCreating   0
customer-portal-7c7f8c9f8d-qw7bk   0/1     ContainerCreating   0

The Deployment exists.

The Scheduler has assigned the Pods to worker nodes.

The nodes are healthy.

Yet the application never starts.

At this point, many engineers begin troubleshooting:

Deployments
ReplicaSets
Services
Ingress

However, the real issue is often much deeper.

The problem usually exists in the layer responsible for actually running containers.

That layer is the Container Runtime.

Most Kubernetes engineers spend their time working with Pods, Deployments, Services, and Ingress resources. However, very few understand what happens after Kubernetes schedules a Pod onto a worker node. To understand why these Pods are stuck, we first need to understand that part of the process.

When you execute:

kubectl apply -f deployment.yaml

you can see Pods becoming Running within seconds. Yet behind the scenes, Kubernetes performs dozens of operations before a container actually starts.

Many engineers assume Kubernetes itself runs containers.

That is not true.

Kubernetes orchestrates containers, but it does not run them directly.

Instead, Kubernetes relies on a specialized component known as a Container Runtime.

The container runtime is the hidden engine that turns Kubernetes manifests into running workloads.

Without a container runtime:

Images cannot be pulled
Containers cannot start
Pods cannot run
Applications cannot execute

In this guide, you’ll learn how container runtimes work, why Kubernetes introduced the Container Runtime Interface (CRI), why Kubernetes removed its built-in Docker integration (dockershim), and how containerd and CRI-O actually start containers on Linux systems.

By the end of this article, you’ll understand one of the most important layers of Kubernetes that many engineers never explore.

🔗Following the Kubernetes Learning Roadmap?

This article is part of my comprehensive Kubernetes learning series designed to take you from Kubernetes fundamentals to advanced platform engineering concepts.

Start here: https://geekymukesh.com/kubernetes-in-2026-the-ultimate-8-week-learning-roadmap/

Roadmap Progress:

✅ Week 1: Core Kubernetes Objects

✅ Week 2: Controllers & Desired State

✅ Week 3: Kubernetes Architecture

🚀 Week 4: Container Runtime & CRI (Current Article)

⏳ Week 5: Security & Policy

⏳ Week 6: Observability

⏳ Week 7: Extensibility & Platform Engineering

⏳ Week 8: Webhooks & Advanced Control

Why Container Runtimes Matter

Before diving into CRI and containerd, let’s answer an important question:

Why should a Kubernetes engineer care about container runtimes?

Because many production problems occur below the Kubernetes layer.

For example:

Pods stuck in ContainerCreating
ImagePullBackOff errors
CrashLoopBackOff failures
Runtime communication failures
Node-level container issues

Most engineers immediately start troubleshooting Deployments or Services.

However, the root cause often exists in the container runtime itself.

Understanding this layer allows you to:

Troubleshoot faster
Understand Kubernetes internals
Design production-ready clusters
Pass Kubernetes certification exams
Become a stronger platform engineer

More importantly, it helps you understand what actually happens when Kubernetes launches workloads.

The Journey of a Pod

In Week 3, we explored Kubernetes architecture and learned how the Control Plane communicates with worker nodes.

Now let’s focus on what happens after scheduling.

Suppose we deploy a simple application:

kubectl apply -f deployment.yaml

The process looks like this:

User
 ↓
kubectl
 ↓
API Server
 ↓
etcd
 ↓
Controller Manager
 ↓
Scheduler
 ↓
Worker Node
 ↓
kubelet
 ↓
Container Runtime
 ↓
Linux Kernel
 ↓
Container Running

Most tutorials stop at:

Scheduler → Node

However, the most interesting work begins after the Scheduler selects a node.

This is where kubelet and the container runtime take over.

This runtime layer is often invisible to engineers until something breaks.

When Pods become stuck in:

ContainerCreating
ImagePullBackOff
CrashLoopBackOff

the kubelet and the runtime on that node are where the evidence is, even when the root cause turns out to be the image or the application.

Understanding this layer is therefore critical for anyone operating Kubernetes in production.

Let’s Reproduce the Scenario

Before learning the theory, let’s create a workload and observe how Kubernetes behaves.

Create a Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: runtime-demo
spec:
  replicas: 3
  selector:
    matchLabels:
      app: runtime-demo
  template:
    metadata:
      labels:
        app: runtime-demo
    spec:
      containers:
      - name: nginx
        image: nginx:latest

Deploy it:

kubectl apply -f deployment.yaml

Watch the Pods:

kubectl get pods -w

You will see Pods transition through several states before becoming Running.

At this point, an important question arises:

Which component actually pulled the image and started the container?

The answer is not Kubernetes itself.

The answer is the Container Runtime.

Let’s understand what that means.

What You Will Learn In This Guide

By the end of this article, you’ll understand:

What a Container Runtime is
Why Kubernetes introduced the Container Runtime Interface (CRI)
Why dockershim (the built-in Docker integration) was removed from Kubernetes
How containerd works internally
How containers are created using Linux namespaces and cgroups
How to troubleshoot ImagePullBackOff and CrashLoopBackOff errors
How AKS runs workloads using containerd
Enterprise best practices for managing container runtimes

Now that we’ve seen a real-world problem, let’s dive deeper into the hidden engine that powers every Kubernetes workload.

What Is a Container Runtime?

A container runtime is software responsible for running containers on a host operating system.

Think of Kubernetes as a manager.

The manager decides:

What should run
How many replicas should exist
Which node should host the workload

However, the manager does not perform the actual work.

The container runtime performs the work.

It is responsible for:

Pulling images
Creating containers
Starting processes
Managing namespaces
Applying resource limits
Handling container lifecycle events

In simple terms:

A container runtime is the software that actually launches and manages containers.

Without it, Kubernetes would have no way to execute workloads.

Responsibilities of a Container Runtime

The runtime performs several critical tasks.

Image Management

Before a container can start, the node must have the required image.

Example:

image: nginx:latest

The runtime:

Contacts the registry
Downloads image layers
Stores them locally

Supported registries include:

Docker Hub
Azure Container Registry
Amazon ECR
Google Artifact Registry
Private registries

Without successful image retrieval, container startup cannot continue.

Container Creation

Once the image exists locally, the runtime creates the container.

This includes:

Creating namespaces
Mounting filesystems
Configuring networking
Preparing isolation boundaries

At this stage, the container exists but is not yet running.

Container Startup

Next, the runtime starts the process inside the container.

For example:

image: nginx

The runtime launches:

nginx

inside an isolated environment.

The container is now active.

Resource Isolation

The runtime enforces:

CPU limits
Memory limits
Process isolation
Network isolation

These controls prevent one workload from impacting others.

This isolation is achieved using Linux kernel features, which we will explore later in this guide.

The Evolution of Container Runtimes

To understand why CRI exists, we must first understand how Kubernetes originally worked.

When Kubernetes first became popular, Docker dominated the container ecosystem.

Because of this, Kubernetes initially communicated directly with Docker.

The architecture looked like this:

kubelet
   ↓
 Docker
   ↓
Containers

This design worked well in the early days.

However, over time several challenges emerged.

Docker was designed as:

A developer platform
An image building tool
A container packaging solution

Kubernetes only needed:

Container execution

This created unnecessary complexity.

The Kubernetes community needed a more flexible approach.

This eventually led to one of the most important architectural changes in Kubernetes history: the Container Runtime Interface (CRI).

Dockershim

As described above, kubelet originally communicated directly with the Docker Engine to run containers.

The architecture looked like this:

kubelet
   ↓
Docker Engine
   ↓
Containers

At first, this approach worked well because Docker provided everything Kubernetes needed to run containers. However, Docker was designed as a complete container platform rather than a simple container runtime.

Docker included:

Image building capabilities
Image management
Container lifecycle management
Registry integrations
CLI tooling
Developer-focused workflows

Kubernetes only required one small part of Docker:

Container Execution

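As Kubernetes adoption increased, other runtimes such as CRI-O and containerd emerged. The Kubernetes community wanted a standardized way to communicate with any runtime instead of being tightly coupled to Docker, which led to the Container Runtime Interface (CRI), covered in the next section.

Docker Engine does not implement CRI itself. To keep Docker working behind the new interface, kubelet shipped with an adapter called:

Dockershim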

The architecture became:

kubelet
   ↓
Dockershim
   ↓
Docker Engine
   ↓
Containers

Dockershim acted as a translation layer between kubelet and Docker.

While this improved compatibility, it also introduced additional complexity, maintenance overhead, and performance costs.

Eventually, the Kubernetes project deprecated Dockershim in v1.20 and removed it in v1.24.

This marked a major shift toward runtime standardization.

What Is CRI? (Container Runtime Interface)

The Container Runtime Interface (CRI) is one of the most important architectural improvements in Kubernetes.

CRI is a standardized API that allows kubelet to communicate with any compliant container runtime.

Instead of Kubernetes supporting individual runtimes directly, Kubernetes now communicates through a common interface.

Think of CRI as a universal language.

Without CRI:

kubelet
   ↓
Docker-specific code

With CRI:

kubelet
   ↓
CRI
   ↓
Any Runtime

This abstraction allows Kubernetes to support multiple runtimes without modifying kubelet itself.

Examples of CRI-compliant runtimes include:

containerd
CRI-O
Mirantis Container Runtime (through the cri-dockerd adapter)

This design significantly improved flexibility and maintainability.

Why Kubernetes Needed CRI

Before CRI existed, supporting multiple runtimes required Kubernetes developers to maintain runtime-specific integrations.

This created several challenges:

Increased complexity
Higher maintenance costs
Slower innovation
Runtime lock-in

CRI solved these issues by introducing a standard communication layer.

Benefits include:

Runtime Independence

Kubernetes can work with multiple runtimes.

Simplified Maintenance

The Kubernetes project no longer maintains runtime-specific code inside kubelet.

Faster Innovation

Runtime vendors can innovate independently while remaining compatible with Kubernetes.

Better Stability

A standardized interface reduces integration issues.

This is one of the reasons modern Kubernetes is more modular than earlier versions.

How CRI Works

CRI is a gRPC API that exposes two primary services.

Runtime Service

Responsible for:

Creating containers
Starting containers
Stopping containers
Removing containers
Managing Pod sandboxes

Image Service

Responsible for:

Pulling images
Listing images
Removing images

The architecture looks like this:

kubelet
   ↓
Container Runtime Interface (CRI)
   ↓
┌──────────────────────┐
│ Runtime Service      │
│ Image Service        │
└──────────────────────┘
   ↓
containerd

Whenever kubelet needs to start a workload, it sends requests through CRI rather than communicating directly with the runtime.

Complete CRI Request Flow

Let’s examine what happens when a Pod is scheduled.

Step 1:

Scheduler assigns a Pod to a node.

Pod Assigned

Step 2:

kubelet detects the assignment.

New Pod Available

Step 3:

kubelet sends a CRI request.

Create Pod Sandbox

Step 4:

containerd receives the request.

Step 5:

containerd pulls required images.

Step 6:

containerd creates containers.

Step 7:

containerd starts processes.

Step 8:

kubelet receives status updates.

Step 9:

API Server reflects Pod state.

The workflow looks like:

API Server
    ↓
Scheduler
    ↓
kubelet
    ↓
CRI
    ↓
containerd
    ↓
Linux Kernel
    ↓
Running Container

This entire process typically completes within seconds.

containerd Architecture Deep Dive

Today, containerd is the most widely used Kubernetes runtime.

Major managed Kubernetes services such as AKS, EKS, and GKE use containerd by default.

containerd focuses exclusively on container execution.

Unlike Docker, it does not attempt to be a complete developer platform.

The architecture looks like:

containerd
     ↓
containerd-shim
     ↓
runc
     ↓
Linux Kernel

Each component has a specific role.

containerd

Responsible for:

Image management
Container lifecycle
Runtime coordination

containerd-shim

Acts as an intermediary between containerd and running containers.

Benefits:

Containers continue running if containerd restarts.
Better process isolation.

runc

Low-level OCI runtime.

Responsible for:

Creating namespaces
Configuring cgroups
Starting container processes

Linux Kernel

Provides the actual isolation mechanisms.

CRI-O Architecture

CRI-O was created specifically for Kubernetes.

Unlike Docker, CRI-O focuses entirely on Kubernetes workloads.

Architecture:

kubelet
   ↓
CRI
   ↓
CRI-O
   ↓
runc
   ↓
Linux Kernel

CRI-O offers:

Lightweight architecture
Kubernetes-native design
Strong security focus

It is heavily used in environments such as OpenShift.

containerd vs CRI-O

Feature             containerd   CRI-O
Popularity          Very High    High
AKS default         Yes          No
EKS default         Yes          No
GKE default         Yes          No
OpenShift           Limited      Primary
Kubernetes Native   Yes          Yes
Ecosystem Support   Extensive    Strong

For most engineers, containerd is the runtime they will encounter most often.

OCI Standards Explained

One major challenge in the early container ecosystem was portability.

Different runtimes implemented containers differently.

To solve this issue, the Open Container Initiative (OCI) was created.

OCI defines industry standards for:

OCI Image Specification

Defines how images are packaged: a manifest, a configuration, and a set of filesystem layers.

Example:

nginx:latest

OCI Runtime Specification

Defines how containers should run.

Benefits include:

Portability
Consistency
Runtime interoperability

This is why an image built on one platform can run on another platform.

Linux Namespaces: The Foundation of Containers

Many engineers believe containers are a special technology.

In reality, containers are primarily Linux features working together.

Namespaces provide isolation.

Each container receives its own view of system resources.

Common namespaces include:

Namespace   Purpose
PID         Process isolation
NET         Network isolation
MNT         Filesystem isolation
IPC         Inter-process communication
UTS         Hostname isolation

For example:

A container may believe it is running process ID 1 even though the host system sees a completely different process ID.

This illusion is created by namespaces.
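To make this concrete, the kernel exposes the namespaces of every process under /proc. The sketch below is a minimal illustration and not part of any Kubernetes tooling: ns_id and same_namespace are hypothetical helper names, and it assumes a Linux host with /proc mounted.

```shell
#!/bin/sh
# Sketch: inspect which namespaces a process belongs to (Linux only).
# ns_id and same_namespace are illustrative helper names.

# ns_id PID TYPE prints the namespace identifier, in the form "pid:[<inode>]".
ns_id() {
    readlink "/proc/$1/ns/$2"
}

# same_namespace PID_A PID_B TYPE succeeds when both processes share
# the namespace of the given type.
same_namespace() {
    [ "$(ns_id "$1" "$3")" = "$(ns_id "$2" "$3")" ]
}

# Print the five namespaces discussed above for the current shell.
for type in pid net mnt ipc uts; do
    echo "$type: $(ns_id $$ "$type")"
done
```

On a worker node, comparing your shell with the host PID of a containerized process (for example same_namespace $$ <container-pid> net) would normally fail, because the container lives in its own network namespace. Reading another process's namespace links usually requires root.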

cgroups: Resource Control for Containers

While namespaces provide isolation, cgroups provide resource control.

cgroups allow Linux to enforce limits on:

CPU
Memory
Disk I/O
Number of processes (PIDs)

Example Kubernetes configuration:

resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
  limits:
    cpu: "1"
    memory: "1Gi"

When Kubernetes applies these limits, the runtime configures corresponding cgroups in the Linux kernel.

Without cgroups:

One container could consume all available resources and impact other workloads.
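As a rough illustration of that translation, the sketch below converts the limits from the example above into the values a runtime would write for cgroup v2 (cpu.max and memory.max). It assumes the default CFS period of 100000 microseconds, handles only whole cores, the m suffix, and the Ki/Mi/Gi suffixes, and uses hypothetical function names.

```shell
#!/bin/sh
# Sketch: translate Kubernetes resource limits into cgroup v2 values.
# Assumes the default CFS period of 100000 microseconds.

CFS_PERIOD_US=100000

# cpu_limit_to_cpu_max LIMIT prints "<quota> <period>" as stored in cpu.max.
# Supports whole cores ("1") and millicores ("500m") only.
cpu_limit_to_cpu_max() {
    case $1 in
        *m) millicores=${1%m} ;;
        *)  millicores=$(( $1 * 1000 )) ;;
    esac
    echo "$(( millicores * CFS_PERIOD_US / 1000 )) $CFS_PERIOD_US"
}

# memory_limit_to_bytes LIMIT prints the byte count stored in memory.max.
memory_limit_to_bytes() {
    case $1 in
        *Ki) echo $(( ${1%Ki} * 1024 )) ;;
        *Mi) echo $(( ${1%Mi} * 1024 * 1024 )) ;;
        *Gi) echo $(( ${1%Gi} * 1024 * 1024 * 1024 )) ;;
        *)   echo "$1" ;;   # treated as plain bytes
    esac
}

cpu_limit_to_cpu_max "1"      # prints: 100000 100000
memory_limit_to_bytes "1Gi"   # prints: 1073741824
```

A quota equal to the period means the container may use one full CPU per scheduling period. A limit of 500m halves the quota, which is why CPU-limited containers are throttled rather than killed, while exceeding memory.max leads to an out-of-memory kill.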

Understanding the Container Lifecycle

A container does not instantly become Running.

It progresses through multiple stages.

When we deploy an application in Kubernetes, we usually focus on the final result:

kubectl get pods

Output:

NAME             READY   STATUS
nginx-pod        1/1     Running

However, reaching the Running state involves multiple steps behind the scenes.

The container runtime, kubelet, Linux kernel, and Kubernetes control plane work together to move the workload through a series of lifecycle stages.

The lifecycle looks like:

Image Pull
      ↓
Container Creation
      ↓
Container Start
      ↓
Readiness Check
      ↓
Running
      ↓
Termination

Let’s examine each stage.

Stage 1: Image Pull

Runtime downloads the image.

Before a container can start, the container runtime must obtain the required image.

Consider the following container spec from a Deployment:

containers:
- name: web
  image: nginx:latest

When kubelet notices this Pod assignment, it sends a request through CRI to the runtime.

The runtime then:

Contacts the image registry
Authenticates if required
Downloads image layers
Stores them locally

The workflow looks like:

kubelet
    ↓
CRI
    ↓
containerd
    ↓
Docker Hub / ACR / ECR
    ↓
Download Image

What Is Actually Downloaded?

A container image consists of multiple layers.

Example:

nginx:latest
├── Base Linux Layer
├── Libraries Layer
├── Nginx Layer
└── Configuration Layer

The runtime downloads only the layers that do not already exist on the node.

This makes image pulls faster and more efficient.

Common Failures During Image Pull

This stage is responsible for errors such as:

ImagePullBackOff
ErrImagePull

Common causes include:

Incorrect image name
Wrong image tag
Registry unavailable
Authentication failures
Network connectivity issues

Practical Verification

View image pull events:

kubectl describe pod <pod-name>

Example:

Pulling image "nginx:latest"
Successfully pulled image

Stage 2: Container Creation

Namespaces and cgroups are configured.

Once the image exists locally, the runtime begins creating the container environment.

This stage is often misunderstood.

The runtime does not immediately start the application.

Instead, it first prepares an isolated execution environment.

This includes:

Creating Linux namespaces
Configuring cgroups
Mounting filesystems
Creating network interfaces
Preparing the Pod sandbox

Think of this stage as preparing a new apartment before someone moves in.

The apartment must exist before the tenant can live there.

Namespaces Created

The runtime typically creates:

Namespace   Purpose
PID         Process isolation
NET         Network isolation
MNT         Filesystem isolation
IPC         Inter-process communication
UTS         Hostname isolation

This ensures the container cannot directly interfere with other workloads.

cgroups Configured

The runtime applies resource limits.

Example:

resources:
  limits:
    cpu: "1"
    memory: "1Gi"

The runtime converts these Kubernetes settings into Linux cgroups.

This prevents containers from consuming unlimited resources.

Pod Sandbox Creation

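Before containers start, the runtime creates a Pod Sandbox at kubelet’s request.

The Pod Sandbox provides:

Shared network namespace (one IP address for the whole Pod)
Shared IPC namespace
Shared hostname (UTS namespace)

All containers inside the Pod use this sandbox. Volumes are prepared separately by kubelet and mounted into each container that references them.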

Stage 3: Container Start

Application process launches.

After the environment is prepared, the runtime starts the application process.

Suppose the image contains:

nginx

The runtime launches:

nginx

inside the isolated container environment.

Internally:

containerd
     ↓
containerd-shim
     ↓
runc
     ↓
Linux Kernel
     ↓
Application Process

The process now begins executing.

What Happens Next?

The application:

Loads configuration
Opens ports
Connects to databases
Initializes services

At this stage the container is technically running.

However, Kubernetes still does not consider the application ready to receive production traffic.

Stage 4: Readiness Check

Kubernetes verifies application availability.

This is one of the most important stages in production environments.

A running process does not necessarily mean the application is ready.

Example:

Container Started
Database Connection Not Ready

If traffic arrives immediately, users may experience failures.

To prevent this, Kubernetes performs readiness checks.

Example:

readinessProbe:
  httpGet:
    path: /health
    port: 80

kubelet repeatedly calls:

http://<pod-ip>:80/health

Possible outcomes:

Success

200 OK

Pod becomes:

Ready

Traffic can now be routed.

Failure

500 Internal Server Error

Pod remains:

Not Ready

Traffic is blocked until the application becomes healthy.
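The decision rule behind these two outcomes can be sketched in a few lines. For HTTP probes, any status code from 200 up to but not including 400 counts as success. The function names below are hypothetical, and the sketch ignores the probe's success and failure thresholds, so it models a single probe attempt only.

```shell
#!/bin/sh
# Sketch: how a single HTTP readiness probe attempt is interpreted.
# probe_outcome and http_status are illustrative helper names.

# probe_outcome CODE prints "Ready" for 200-399 and "Not Ready" otherwise.
probe_outcome() {
    code=$1
    if [ "$code" -ge 200 ] && [ "$code" -lt 400 ]; then
        echo "Ready"
    else
        echo "Not Ready"
    fi
}

# http_status URL prints the HTTP status code of one request, giving up
# after 1 second. curl prints 000 when no response is received.
http_status() {
    curl -s -o /dev/null --max-time 1 -w '%{http_code}' "$1"
}

probe_outcome 200   # prints: Ready
probe_outcome 500   # prints: Not Ready
```

To try it against a real Pod from inside the cluster network, something like probe_outcome "$(http_status http://<pod-ip>:80/health)" would combine the two helpers.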

Why This Matters

Readiness probes help avoid:

Traffic reaching Pods that are still starting
Rollouts that replace healthy Pods with ones that are not ready
Requests failing while dependencies such as databases initialize

They are critical for zero-downtime deployments.

Stage 5: Running

Container serves traffic.

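The Pod’s phase is already Running once its containers have started. When the readiness checks pass, the Pod is also marked Ready and is added to the Service’s endpoints.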

Now the application begins serving real user traffic.

Traffic flow typically looks like:

User
   ↓
Load Balancer
   ↓
Service
   ↓
Ready Pod

At this point:

Image downloaded
Container created
Process started
Health checks passed
Traffic enabled

The workload is now fully operational.

Continuous Monitoring

Even after reaching Running state, Kubernetes continues monitoring:

Container health
Resource consumption
Probe results

This monitoring never stops.

Stage 6: Termination

Container receives a graceful shutdown signal.

Eventually every container reaches the end of its lifecycle.

Termination may occur because:

Deployment update
Scaling operation
Node maintenance
Pod deletion
Application shutdown

Example:

kubectl delete pod nginx-pod

Kubernetes does not immediately kill the container.

Instead it performs a graceful shutdown.

Step 1: SIGTERM Signal

Kubernetes sends:

SIGTERM

to the application.

This tells the application:

Please shut down cleanly.

Applications should use this time to:

Finish requests
Close database connections
Save state
Flush logs

Step 2: Grace Period

Default:

30 seconds

During this period Kubernetes waits for clean termination.

Step 3: SIGKILL

If the application refuses to stop:

SIGKILL

is issued.

This forcibly terminates the process.
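What "shutting down cleanly" looks like depends on the application. As a minimal sketch, assuming a shell script is the container's main process, a SIGTERM handler could look like this. The serve function and its messages are hypothetical.

```shell
#!/bin/sh
# Sketch: a main loop that exits cleanly when it receives SIGTERM.
# serve is an illustrative name; call it as the container's main process.

serve() {
    # On SIGTERM, finish up and exit with status 0 before the grace
    # period runs out and SIGKILL arrives.
    trap 'echo "SIGTERM received: finishing work"; exit 0' TERM

    echo "serving"
    while :; do
        # Sleep in the background and wait on it, so the trap runs as
        # soon as the signal arrives instead of after the sleep ends.
        sleep 1 &
        wait $!
    done
}
```

Calling serve at the end of the script turns it into a long-running process that stops within moments of receiving SIGTERM. This matters in containers because the main process runs as PID 1 in its PID namespace, and the kernel does not apply default signal handling to PID 1. An application that installs no SIGTERM handler simply keeps running until the grace period expires and SIGKILL ends it.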

Why Graceful Termination Matters

Without proper shutdown handling:

User requests may fail
Data may be lost
Database transactions may become inconsistent

Enterprise applications must always support graceful termination.

Putting It All Together

When you deploy a Pod, the runtime performs the following sequence:

Image Pull
      ↓
Download image from registry
      ↓
Container Creation
      ↓
Namespaces + cgroups configured
      ↓
Container Start
      ↓
Application launched
      ↓
Readiness Check
      ↓
Application verified
      ↓
Running
      ↓
Serving production traffic
      ↓
Termination
      ↓
Graceful shutdown

Understanding this lifecycle is extremely important when troubleshooting issues such as:

ImagePullBackOff
ContainerCreating
CrashLoopBackOff
Failed readiness probes

These states become much easier to diagnose once you understand the runtime’s responsibilities.
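When a namespace holds many Pods, a small filter helps you see only the ones that need attention. The sketch below reads the default output of kubectl get pods from standard input and prints every Pod whose STATUS column is not Running. stuck_pods is a hypothetical helper name, and the sample input is the output shown in the introduction.

```shell
#!/bin/sh
# Sketch: list Pods whose STATUS is anything other than Running.
# Expects the default `kubectl get pods` table on stdin
# (columns: NAME READY STATUS RESTARTS ...).

stuck_pods() {
    # Skip the header row, then compare the third column.
    awk 'NR > 1 && $3 != "Running" { print $1 ": " $3 }'
}

# Demo with the sample output from the introduction.
stuck_pods <<'EOF'
NAME                                READY   STATUS              RESTARTS
customer-portal-7c7f8c9f8d-jd2m8   0/1     ContainerCreating   0
customer-portal-7c7f8c9f8d-xn4lp   0/1     ContainerCreating   0
customer-portal-7c7f8c9f8d-qw7bk   0/1     ContainerCreating   0
EOF
```

Against a live cluster, the same helper would be used as kubectl get pods -n customer-portal | stuck_pods. Note that it also reports Pods in the Completed state, which is expected for finished Jobs.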

Hands-On Lab:

Understanding Container Runtime & CRI Through a Real Business Deployment

So far, we have explored the theory behind Container Runtimes, CRI, containerd, and how Kubernetes interacts with the Linux kernel to run workloads. However, theory alone is rarely enough to build production-level expertise.

In real-world environments, Platform Engineers and DevOps teams are responsible for deploying applications, troubleshooting failures, and ensuring workloads remain available even when infrastructure issues occur. Understanding how container runtimes behave during these scenarios is a critical skill for operating Kubernetes in production.

To bridge the gap between theory and practice, let’s walk through a complete business use case. Imagine your organization is launching a customer-facing application on Azure Kubernetes Service (AKS). As the Platform Engineer, your responsibility is to deploy the workload, monitor its lifecycle, investigate runtime-related failures, and understand exactly how kubelet, CRI, and containerd collaborate behind the scenes.

Throughout this lab, we will not only deploy a real application but also intentionally simulate common production incidents such as ImagePullBackOff, CrashLoopBackOff, and node failures. By observing these scenarios firsthand, you’ll gain a much deeper understanding of how Kubernetes actually runs containers and how to troubleshoot issues when things go wrong.

By the end of this hands-on exercise, you will be able to:

Deploy a workload on Kubernetes
Identify the container runtime used by your cluster
Observe how containerd pulls and starts containers
Understand how kubelet communicates through CRI
Troubleshoot image pull failures and container crashes
Simulate node-level failures and observe Kubernetes recovery

Let’s begin by deploying our customer portal application and tracing every step of the container lifecycle from deployment creation to a running application.

Deploying a Customer Portal on AKS

Business Requirement

Imagine your company is launching a new customer-facing portal.

The development team has built a containerized application and pushed the image to Azure Container Registry.

Your task as a Platform Engineer is to deploy it on AKS and understand exactly how the Container Runtime and CRI participate in the deployment lifecycle.

Architecture

Developer
    ↓
Azure Container Registry
    ↓
AKS Cluster
    ↓
Scheduler
    ↓
kubelet
    ↓
CRI
    ↓
containerd
    ↓
Linux Kernel
    ↓
Running Container

Step 1: Create Namespace

kubectl create namespace customer-portal

Verify:

kubectl get ns

Expected:

NAME              STATUS
customer-portal   Active

Step 2: Create Deployment

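Create deployment.yaml. The public nginx image stands in for the portal image here so the lab runs without access to a private registry: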

apiVersion: apps/v1
kind: Deployment
metadata:
  name: customer-portal
  namespace: customer-portal
spec:
  replicas: 3
  selector:
    matchLabels:
      app: customer-portal
  template:
    metadata:
      labels:
        app: customer-portal
    spec:
      containers:
      - name: customer-portal
        image: nginx:latest
        ports:
        - containerPort: 80

Deploy:

kubectl apply -f deployment.yaml

Step 3: Watch Kubernetes Work

Monitor:

kubectl get pods -n customer-portal -w

Observe:

ContainerCreating
Running

What happened behind the scenes?

This is where theory connects with reality.

Internal Flow

When the Pod enters:

ContainerCreating

the following occurs:

Scheduler
    ↓
Node Selected
    ↓
kubelet
    ↓
CRI Request
    ↓
containerd
    ↓
Pod Sandbox
    ↓
Image Pull
    ↓
Container Start

This is CRI in action.

Step 4: Identify the Runtime

Most engineers don’t know what runtime their cluster uses.

Check:

kubectl describe node

Find:

Container Runtime Version

Example:

containerd://1.7.2

This confirms that the AKS node runs containerd.
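If you want the same information for every node at once, or want to use it in a script, the runtime string can be read from the node status and split into its parts. This is a small sketch with hypothetical helper names. list_node_runtimes needs access to a cluster, so it is defined here but not called.

```shell
#!/bin/sh
# Sketch: work with the containerRuntimeVersion value reported by each node.
# All function names are illustrative.

# runtime_name "containerd://1.7.2" prints the part before "://".
runtime_name() {
    echo "${1%%://*}"
}

# runtime_version "containerd://1.7.2" prints the part after "://".
runtime_version() {
    echo "${1#*://}"
}

# Prints one "<node-name><TAB><runtime>://<version>" line per node.
list_node_runtimes() {
    kubectl get nodes \
        -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.containerRuntimeVersion}{"\n"}{end}'
}

runtime_name "containerd://1.7.2"      # prints: containerd
runtime_version "containerd://1.7.2"   # prints: 1.7.2
```

For a quick look without any scripting, kubectl get nodes -o wide shows the same value in its CONTAINER-RUNTIME column.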

Step 5: SSH Into Node

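AKS nodes are usually not reachable over SSH directly, and a node debug session started with kubectl debug is the common alternative. If you do have SSH access to the node, for example on a self-managed cluster: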

ssh azureuser@worker-node

Check runtime:

systemctl status containerd

Expected:

active (running)

You are now looking at the actual runtime process.

Step 6: Inspect Images Pulled By Runtime

With crictl installed on the node, list the images (root privileges are usually required):

crictl images

Example:

IMAGE
nginx
pause

These images were downloaded by containerd, not by the Kubernetes control plane. The pause image is the one that backs each Pod sandbox.

Step 7: Observe Running Containers

crictl ps

Output:

CONTAINER ID
IMAGE
STATE

You can now directly see the containers managed by the runtime.

Step 8: Simulate ImagePullBackOff

Edit Deployment:

image: nginx:notexist

Apply:

kubectl apply -f deployment.yaml

Check:

kubectl get pods -n customer-portal

Result:

ImagePullBackOff

Investigate Failure

Describe Pod:

kubectl describe pod <pod-name> -n customer-portal

Expected:

Failed to pull image

The Scheduler worked.

The API Server worked.

The Deployment worked.

The runtime could not retrieve the image.

This is a runtime-level issue.

Step 9: Simulate CrashLoopBackOff

Add this container spec to a Pod manifest saved as crash-demo.yaml:

containers:
- name: crash-demo
  image: busybox
  command: ["sh","-c","exit 1"]

Deploy:

kubectl apply -f crash-demo.yaml

Observe:

kubectl get pods

Result:

CrashLoopBackOff

What Happened?

containerd successfully:

✔ Pulled image

✔ Created container

✔ Started process

Application:

exit 1

Container stopped.

kubelet restarted it.

Loop continues.
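The "BackOff" part of the status name refers to the growing delay kubelet inserts between restarts. According to the Kubernetes documentation, the delay starts at 10 seconds, doubles after each crash, and is capped at 300 seconds. The sketch below only reproduces that arithmetic. backoff_seconds is a hypothetical name, and counting restarts from 1 is an assumption made for illustration.

```shell
#!/bin/sh
# Sketch: the restart delay sequence behind CrashLoopBackOff.
# Starts at 10s, doubles per restart, capped at 300s.

# backoff_seconds N prints the delay before the Nth restart (N >= 1).
backoff_seconds() {
    restarts=$1
    delay=10
    i=1
    while [ "$i" -lt "$restarts" ] && [ "$delay" -lt 300 ]; do
        delay=$(( delay * 2 ))
        i=$(( i + 1 ))
    done
    if [ "$delay" -gt 300 ]; then
        delay=300
    fi
    echo "$delay"
}

# Delays for the first seven restarts: 10 20 40 80 160 300 300
for n in 1 2 3 4 5 6 7; do
    echo "restart $n: wait $(backoff_seconds "$n")s"
done
```

This is why a crashing Pod appears to restart quickly at first and then sits in CrashLoopBackOff for minutes at a time: the container is not running during the delay, so the fix has to come from the application or its configuration.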

Step 10: View Runtime Logs

Check:

kubectl logs <pod>

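kubectl logs returns what the container wrote to stdout and stderr. For a container that keeps crashing, add --previous to read the output of the last terminated instance. Logs from the runtime itself live on the node, for example in the systemd journal of the containerd service.

This helps determine if: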

Application failed
Runtime failed
Configuration failed

Step 11: Simulate Node Failure

Drain node:

kubectl drain <node-name> --ignore-daemonsets

Observe:

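kubectl get pods -n customer-portal -o wide

The Pods on the drained node are evicted, and the ReplicaSet creates replacements that the Scheduler places on other nodes. A drain is a controlled eviction, so it approximates a node failure rather than reproducing one. When you are done, run kubectl uncordon <node-name> so the node accepts Pods again.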

Behind the scenes:

Node Drained
     ↓
Scheduler
     ↓
New Node Selected
     ↓
kubelet
     ↓
CRI
     ↓
containerd
     ↓
Containers Started

Frequently Asked Questions

What is a Container Runtime in Kubernetes?

A container runtime is the software responsible for pulling images, creating containers, and running workloads on a Kubernetes node.

What is CRI in Kubernetes?

CRI (Container Runtime Interface) is a standard API that allows kubelet to communicate with different container runtimes such as containerd and CRI-O.

Does Kubernetes still use Docker?

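No, not as its container runtime. Kubernetes removed Dockershim in v1.24 and now primarily uses CRI-compliant runtimes such as containerd. Images built with Docker still run, because Docker produces OCI-compatible images.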

What is the difference between containerd and CRI-O?

Both are CRI-compliant runtimes. containerd is the most widely used runtime across AKS, EKS, and GKE, while CRI-O is commonly used with OpenShift.

How do I check which runtime my cluster uses?

Run:

kubectl describe node

and look for the Container Runtime Version field.

Continue Your Kubernetes Journey

If you’re joining this series midway, I strongly recommend reviewing the earlier articles because each week builds upon concepts introduced previously.

🔹 Week 1: Core Kubernetes Objects

🔹 Week 2: Controllers & Desired State

🔹 Week 3: Kubernetes Architecture (The Backbone)

🔹 Complete Kubernetes Learning Roadmap:
https://geekymukesh.com/kubernetes-in-2026-the-ultimate-8-week-learning-roadmap/

In the next article, we’ll explore Kubernetes Security & Policy, where you’ll learn how RBAC, Network Policies, Pod Security Standards, and admission controls protect production clusters.

GeekyMukesh

Week 4: How Kubernetes Actually Runs Containers (Complete CRI Guide)
