Docker and Kubernetes

Objects of Kubernetes 

Kubernetes has a variety of objects that you can use to manage your cluster. Here are some of the key objects: 

  1. Pods: The smallest and simplest Kubernetes object. A pod represents a single instance of a running process in your cluster. 
  2. Services: An abstraction that defines a logical set of pods and a policy by which to access them. 
  3. Deployments: Provide declarative updates to applications, ensuring that the desired number of pod replicas are running. 
  4. ReplicaSets: Ensure that a specified number of pod replicas are running at any given time. 
  5. StatefulSets: Manage the deployment and scaling of a set of pods, and provide guarantees about the ordering and uniqueness of these pods. 
  6. DaemonSets: Ensure that all (or some) nodes run a copy of a pod. 
  7. Jobs: Create one or more pods and ensure that a specified number of them successfully terminate. 
  8. CronJobs: Manage time-based jobs, similar to cron jobs in Unix. 
  9. ConfigMaps: Provide a way to inject configuration data into pods. 
  10. Secrets: Store and manage sensitive information, such as passwords, OAuth tokens, and SSH keys. 
  11. PersistentVolumeClaims (PVCs): Request storage resources from the cluster. 
  12. PersistentVolumes (PVs): Provide storage resources to the cluster. 
  13. Ingress: Manage external access to services in a cluster, typically HTTP. 
  14. NetworkPolicies: Define how pods communicate with each other and other network endpoints. 
  15. ResourceQuotas: Limit the resource consumption per namespace.
  16. LimitRanges: Set constraints on the resource requests and limits for each pod or container in a namespace. 

Deployment file

apiVersion: apps/v1
kind: Deployment
metadata: 
  name: myapplication-deployment
  labels:
    app: myapplication
spec: 
  replicas: 3  # Number of desired pod replicas
  selector: 
    matchLabels: 
      app: myapplication
  template:
    metadata: 
      labels: 
        app: myapplication
    spec: 
      containers: 
      - name: nginx-container 
        image: nginx:latest  # Consider specifying a version for better control
        ports:
        - containerPort: 80

Architecture of Kubernetes

Master Node

The master node is the control plane of the Kubernetes cluster. It manages the cluster and orchestrates the operations across worker nodes. Key components on the master node include:

  1. API Server: The main interface for users and internal components to interact with the cluster. It processes REST commands and manages the cluster state.
  2. Scheduler: Responsible for assigning workloads (Pods) to worker nodes based on resource availability and other constraints.
  3. Controller Manager: Runs various controllers that manage the state of the cluster, such as the Replication Controller, Node Controller, and Endpoint Controller.
  4. etcd: A distributed key-value store that holds the state of the cluster and the configuration data.

Worker node

Worker nodes are the machines that run the containerized applications. Each worker node has the following components:

  1. kubelet: An agent that runs on each worker node and ensures that containers are running and healthy. It communicates with the master node to report the status of the node and manage the containers.
  2. Container Runtime: The software that runs the containers, such as Docker or containerd.
  3. kube-proxy: A network proxy that maintains network rules on nodes and enables communication between Pods.

Dockerfile samples

A simple one:

FROM ubuntu
RUN apt-get update && apt-get install -y git
CMD ["echo", "your image created"]
A Node.js application image:

# Get the base image from Docker Hub
FROM node:18-alpine

# Working directory; all subsequent instructions run in /app inside the container
WORKDIR /app

# Copy everything from the current directory on the host (first .) into the container's working directory (second .), i.e. into /app

COPY . .

# We have to install all our packages using RUN

RUN yarn install --production

#default command to run when a container starts
CMD ["node", "src/index.js"]

# Document the port our application listens on

EXPOSE 3000

Multi-stage Dockerfile

A multi-stage build in Docker is a technique where multiple stages are used in a single Dockerfile to separate the build and runtime environments. This approach allows you to copy only the final build artifacts (like a compiled binary or packaged application) into the runtime image, significantly reducing the image size and improving security. (The first stage produces the package; the second stage copies only that package and nothing else, so no unnecessary files end up in the final image, which is also better for security.)

# Stage 1: Build the application on the Node image from Docker Hub
FROM node:18-alpine AS builder

# Set the working directory
WORKDIR /app

# Copy package.json and package-lock.json
COPY package*.json ./

# Install dependencies
RUN npm install

# Copy the rest of the application code
COPY . .

# Build the application
RUN npm run build

# Stage 2: Serve with Nginx
FROM nginx:alpine

# Copy the build files from the previous stage
COPY --from=builder /app/build /usr/share/nginx/html

# Copy custom Nginx configuration file
COPY nginx.conf /etc/nginx/nginx.conf

# Expose port 80
EXPOSE 80

# Start Nginx
CMD ["nginx", "-g", "daemon off;"]

The `AS builder` in the first FROM names that stage so the second stage's `COPY --from=builder` can reference it; only the built artifacts are copied into the final image.

Removing container forcefully

docker rm -f <container_id>
# Remove an image forcefully
docker rmi -f <image_id>

We can forcefully delete the image of a running container without affecting the container: the image is only untagged, and the running container keeps using the image layers it already has.

Dockerfile instructions

  • FROM: Specifies the base image to use for building the new image.
  • RUN: Executes a command within the image during the build process.
  • COPY: Copies files or directories from the host machine to the image.
  • ADD: Similar to COPY, but can also fetch files from URLs and extract archives.
  • WORKDIR: Sets the working directory for subsequent instructions.
  • ENV: Sets environment variables within the image.
  • EXPOSE: Informs Docker that the container will listen on the specified network port(s) at runtime.
  • CMD: Specifies the default command to run when the container starts.
  • ENTRYPOINT: Similar to CMD, but sets the main command; arguments to docker run are appended to it rather than replacing it, and it can only be overridden with the --entrypoint flag.
  • VOLUME: Creates a mount point in the container for accessing data from the host or other containers.
  • USER: Sets the user or UID to use when running commands in the container.
  • ARG: Defines a build-time variable that can be passed in during the build process.
  • LABEL: Adds metadata to the image.
  • ONBUILD: Specifies commands to be executed when the image is used as a base for another image.
  • SHELL: Overrides the default shell used for the RUN, CMD, and ENTRYPOINT instructions.
  • HEALTHCHECK: Defines a health check command for the container.

Kubernetes YAML file (manifest), declarative way, for a Pod

apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
    - containerPort: 80

Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80

Imperative command for running nginx

kubectl run nginx --image=nginx --port=80

ReplicaSet YAML file

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  # modify replicas according to your case
  replicas: 3
  selector:
    matchLabels:
      tier: frontend
  template:
    metadata:
      labels:
        tier: frontend
    spec:
      containers:
      - name: nginx
        image: nginx

What is the difference between replicaset and replication controller ?

ReplicationController is the traditional way to ensure a specified number of pod replicas are running at any given time. It uses simple equality-based selectors to manage pods, making it suitable for basic replication needs.

However, the ReplicaSet is the more advanced and flexible successor. It supports both set-based and equality-based selectors, offering greater control and flexibility. This makes ReplicaSets more versatile and capable of handling complex scenarios, often within the context of Deployments. Essentially, while both serve to maintain desired pod counts, ReplicaSets bring in a higher degree of sophistication and adaptability.

Service manifest

A Service in Kubernetes is a stable endpoint to expose your applications running in Pods. It provides a way to access a set of Pods.

apiVersion: v1
kind: Service
metadata:
  name: my-app
  labels:
    name: my-app
spec:
  type: NodePort
  ports:
    - port: 80
      targetPort: 80
      nodePort: 30001
  selector:
    name: my-app


Load balancer service

apiVersion: v1
kind: Service
metadata:
  name: my-app
  labels:
    name: my-app
spec:
  type: LoadBalancer
  ports:
    - port: 80
  selector:
    name: my-app

Imperative service

kubectl expose deployment <deployment-name> --type=NodePort --port=80 --name=<service-name>

Namespace

apiVersion: v1
kind: Namespace
metadata:
  name: my-newspace

Scale the app with replicas

kubectl scale deployment <app> --replicas=2 -n my-newspace

Daemonset

DaemonSet in Kubernetes ensures that a specific Pod runs on each node in the cluster. It’s often used for system daemons or services that need to run on every node, like log collectors or monitoring agents

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nginx-daemonset
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80

How to connect to a private Docker Hub repository in Kubernetes and pull images?

1. Create a Docker Registry Secret

  • Use the kubectl create secret command to store your Docker Hub credentials in Kubernetes as a secret. Replace <username>, <password>, and <email> with your Docker Hub credentials.
   kubectl create secret docker-registry <secret-name> \
     --docker-server=https://index.docker.io/v1/ \
     --docker-username=<username> \
     --docker-password=<password> \
     --docker-email=<email>

Example:

  kubectl create secret docker-registry ron-doc-cred \
     --docker-server=https://index.docker.io/v1/ \
     --docker-username=ranjandalai4049 \
     --docker-password=<your-password> \
     --docker-email=ranjandalai4049@gmail.com

2. Reference the Secret in Your Pod/Deployment

  • In your Kubernetes Deployment or Pod configuration, add the secret under imagePullSecrets. This tells Kubernetes to use the credentials stored in the secret to pull images from the private Docker Hub repository. Example YAML:
   apiVersion: apps/v1
   kind: Deployment
   metadata:
     name: my-app
   spec:
     replicas: 1
     selector:
       matchLabels:
         app: my-app
     template:
       metadata:
         labels:
           app: my-app
       spec:
         containers:
         - name: my-container
           image: <your-private-repo>/<image-name>:<tag>
         imagePullSecrets:
         - name: ron-doc-cred

3. Apply the Configuration

  • Apply your YAML file to create the deployment, which will use the secret to pull the private Docker image.
   kubectl apply -f <your-deployment-file>.yaml

Difference between loadbalancer, clusterIP and nodeport ?

NodePort: Exposes a service on each node's IP at a static port. Accessible externally via <NodeIP>:<NodePort>. It is typically used internally within an organization by people who have access to your EC2 instances or VPC.

LoadBalancer: Provisions an external load balancer to distribute traffic. Provides a single external IP or DNS name for access. It is used for external or production traffic.

ClusterIP: The default service type. The service can only be accessed by clients that have access to the Kubernetes cluster network.

ExternalName: Maps the service to the contents of the externalName field (e.g., a DNS name).

Difference between LoadBalancer and Ingress?

LoadBalancer: Directly provisions an external load balancer from a cloud provider to expose a service. Provides a single external IP for direct access, simplifying service exposure but can be costlier.

Ingress: Acts as a smart router, managing and directing external HTTP/HTTPS traffic to multiple services within the cluster. Offers features like SSL termination, URL path-based routing, and virtual hosting, adding flexibility and efficiency to traffic management.

Ingress:

Problems solved

  1. Enterprise-grade load balancing with TLS (Transport Layer Security) termination.
  2. Without Ingress, each service needs its own load balancer; with 1000 services you would need 1000 load balancers, so the charges from the cloud provider are high.

What is ingress in kubernetes ?

Ingress allows multiple services to be exposed using a single IP address. By creating an Ingress resource in Kubernetes and linking it to an Ingress controller (which acts like an enterprise-level load balancer), you can manage traffic routing for multiple microservices efficiently. Ingress supports path-based and host-based routing, enabling more complex routing configurations, and it can also handle TLS termination, enhancing security.

Why is Ingress better than a LoadBalancer service?

Using a LoadBalancer service typically requires provisioning a separate load balancer for each microservice, leading to higher costs due to multiple external IP addresses. In contrast, Ingress allows you to expose multiple services through a single IP address, significantly reducing costs and simplifying management. Additionally, Ingress provides more advanced routing capabilities compared to standard LoadBalancer services.

Path-Based Routing

Path-based routing directs traffic based on the URL path of incoming requests. You can configure rules to route requests to different services depending on the path specified in the URL.

Example:

  • If you have an Ingress rule configured for the following paths:
    • /api → routed to api-service
    • /app → routed to app-service

In this case:

  • A request to http://example.com/api will be forwarded to the api-service.
  • A request to http://example.com/app will be directed to the app-service.

Host-Based Routing

Host-based routing directs traffic based on the Host header of incoming requests. This allows you to expose multiple services under different domain names or subdomains using the same IP address.

Example:

  • If you have an Ingress rule for the following hosts:
    • api.example.com → routed to api-service
    • app.example.com → routed to app-service

In this case:

  • A request to http://api.example.com will be routed to the api-service.
  • A request to http://app.example.com will go to the app-service.
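Both styles can be combined in a single Ingress resource. A minimal sketch (the host names, service names, and ingress class are placeholders for illustration):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-routing
spec:
  ingressClassName: nginx
  rules:
  - host: example.com            # path-based routing on one host
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80
      - path: /app
        pathType: Prefix
        backend:
          service:
            name: app-service
            port:
              number: 80
  - host: api.example.com        # host-based routing
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80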

Why is Ingress better than a LoadBalancer service? (key points)

  1. Cost Efficiency: Ingress is generally more cost-effective than LoadBalancer services because it uses a single load balancer to manage multiple services, whereas LoadBalancer services create a separate load balancer for each service.
  2. Centralized Configuration: Ingress provides a centralized way to manage external access to multiple services within a cluster. You can define rules for routing and load balancing in a single resource, making it easier to manage and maintain.
  3. Advanced Routing Features: Ingress supports more advanced routing features, such as URL-based routing, path-based routing, and host-based routing. This allows you to direct traffic to different services based on the request URL or host header.
  4. SSL Termination: Ingress can handle SSL termination, allowing you to manage SSL certificates and secure communication in a centralized manner. This simplifies the process of securing your applications.
  5. Path-Based Routing: Ingress allows you to route traffic based on the request path, enabling you to serve different content or applications based on the URL path.

Deployment strategies in kubernetes

By default, Kubernetes uses the RollingUpdate strategy (a configuration sketch follows the list below).

Rolling Update (Default): Updates Pods incrementally, replacing old versions of the application with new ones gradually. This ensures that there’s always a certain number of Pods available. Use Case: Ideal for most production workloads where you want to update without downtime.

Blue-Green Deployment: Runs two environments (blue = current, green = new), switching traffic from blue to green after testing. Ideal for quick rollbacks and fully tested updates.

Canary Deployment: Sends a small percentage of traffic to the new version (canary), gradually increasing if stable. Good for risk management and gradual rollout.

Recreate Deployment: Stops old Pods, then starts new ones, causing downtime. Useful for non-critical workloads where downtime is acceptable.

A/B Testing: Routes specific users to the new version for testing purposes. Ideal for experiments and targeted feedback.
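The strategy is set per Deployment; a minimal sketch (the surge/unavailable values are illustrative):

spec:
  strategy:
    type: RollingUpdate      # or Recreate
    rollingUpdate:
      maxSurge: 1            # at most 1 extra pod above the desired count during the update
      maxUnavailable: 1      # at most 1 pod may be unavailable during the update

Note that blue-green, canary, and A/B testing are not built-in strategy types; they are usually implemented with multiple Deployments plus Service/Ingress routing or external tools.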

What are the common errors we see in Kubernetes pods?

  1. CrashLoopBackOff: Indicates that a pod repeatedly crashes after starting. Causes could include application crashes or misconfigurations.
  2. ImagePullBackOff: Means Kubernetes can’t pull the container image. This could be due to authentication issues, image name errors, or missing images.
  3. ErrImagePull: Another image-related error, indicating problems pulling the specified image.
  4. ContainerCreating: If a pod stays in this state for a long time, it might be due to networking issues, insufficient resources, or volume mounting problems.
  5. Init:Error: Issues in the init containers, which must run successfully before the main containers start.
  6. PodUnschedulable: Indicates that the scheduler can’t find a node with enough resources to run the pod.
  7. OOMKilled: The pod was terminated because it exceeded its memory limit.

What is service in kubernetes ?

Service in Kubernetes is an abstraction that defines a logical set of Pods and a policy by which to access them. Services provide a stable endpoint (like a DNS name) to access your Pods, abstracting away the underlying complexities.

What is Deployment in kubernetes ?

Deployment in Kubernetes is a higher-level abstraction that manages the rollout and scaling of a set of Pods. It's the go-to way to declare and update your applications in a reliable and predictable manner.

What is configmap in kubernetes ?

ConfigMap in Kubernetes is a resource used to store non-confidential configuration data in key-value pairs. It allows you to decouple configuration artifacts from image content to keep containerized applications portable.

What are DaemonSets in Kubernetes?

DaemonSet in Kubernetes ensures that a copy of a specific pod runs on all (or some) nodes in the cluster. It’s commonly used for node-level services such as log collection, monitoring, and other system daemons.

What is Helm ?

Helm is the package manager for Kubernetes. Helm charts are simply Kubernetes manifests combined into a single package that can be deployed to a Kubernetes cluster as one unit.
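Typical usage looks like this (the bitnami repository and chart are just examples):

helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm install my-web bitnami/nginx    # install a chart as a release
helm upgrade my-web bitnami/nginx    # upgrade the release
helm rollback my-web 1               # roll back to revision 1
helm uninstall my-web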

How to connect two containers in Docker, such as an application container and a MySQL container?

Step 1: Create a Custom Network

Create a custom network so that your containers can communicate with each other.

docker network create twotier

Step 2: Run the MySQL Container

Run the MySQL container on the custom network you just created.

docker run -d \
    --name mysql \
    -v mysql-data:/var/lib/mysql \
    --network=twotier \
    -e MYSQL_DATABASE=mydb \
    -e MYSQL_ROOT_PASSWORD=admin \
    -p 3306:3306 \
    mysql:5.7

Step 3: Run the Application Container

Run your application container on the same custom network and connect it to the MySQL container.

docker run -d \
    --name flaskapp \
    --network=twotier \
    -e MYSQL_HOST=mysql \
    -e MYSQL_USER=root \
    -e MYSQL_PASSWORD=admin \
    -e MYSQL_DB=mydb \
    -p 5000:5000 \
    flaskapp:latest

To check the Docker networks: docker network ls
To check the details of a network: docker network inspect <network_name>

Can we deploy containers on the control plane?

We can deploy containers on the control plane (master nodes) in a Kubernetes cluster, but it's generally not recommended due to the risk of impacting cluster stability and performance. We can temporarily remove the taints that prevent pods from being scheduled on master nodes:

kubectl taint nodes --all node-role.kubernetes.io/master-
# On newer clusters (v1.24+) the taint is named control-plane instead:
kubectl taint nodes --all node-role.kubernetes.io/control-plane-

Add vs Copy in Dockerfile ?

  • Use COPY when you are just copying files or directories into the image.
  • Use ADD only when you need its extra functionality (e.g., extracting tar files or downloading from a URL).

How to take backup and restore container ?

Backup

Create an image from the container: docker commit <container_id> <backup_image_name>
Save the image to a file: docker save -o /path/to/directory/backup_file.tar <backup_image_name>

Restore

Load the image from the file: docker load -i /path/to/directory/<backup_file.tar>
Run a container from the loaded image: docker run -d --name <new_container_name> <backup_image_name>

How to get access to EKS from a new virtual machine?

  • Install AWS CLI
  • Install kubectl
  • Configure AWS CLI
  • Update kubeconfig: aws eks --region <region> update-kubeconfig --name <cluster-name>

Running this command configures your kubeconfig file to include the necessary information to securely connect to and manage your EKS cluster using kubectl. It ensures that you can perform Kubernetes operations on your EKS cluster from your local machine or any environment where the command is executed.

  • Verify Connection: kubectl get nodes

How do you declare context in Kubernetes?

In Kubernetes, contexts define the cluster, user, and namespace to be used for a particular kubectl command. A context in Kubernetes allows you to easily switch between different clusters or user configurations without needing to repeatedly specify cluster details like the server URL, authentication credentials, or the namespace you want to operate in.

Set a New Context: 
kubectl config set-context <context-name> --cluster=<cluster-name> --user=<user-name> --namespace=<namespace-name>
List All Contexts: 
kubectl config get-contexts
Change the cluster:
kubectl config use-context example-context

What happens when you delete /var/lib/docker/overlay?

Deleting the /var/lib/docker/overlay directory can have significant consequences, as it is a critical part of the Docker storage system. This directory is used by Docker to store the overlay filesystem, which contains the actual data for the Docker containers (on modern installs the directory is usually /var/lib/docker/overlay2).

Data Loss: If you delete this directory, you will lose all the data stored by the running containers that use the OverlayFS storage driver.

Container and Image Data: Docker uses layers to build images. Deleting this directory will result in the loss of any containers or images that are using those layers. Essentially, Docker images may become unusable if they rely on the data in this directory.

Containers Become Unusable: If you have running containers that use OverlayFS, they will not be able to function properly because their filesystem is deleted. You might need to recreate containers or restart them, and any in-progress data (logs, databases, etc.) will be lost.

Why are Kubernetes resources like `Pod`, `ReplicaSet`, and `Deployment` defined separately, even though a `Deployment` can manage both Pods and ReplicaSets?

Separation of Concerns: Different resources serve different concerns. While Deployment handles deployment strategies, scaling, and versioning, ReplicaSet focuses on ensuring a specified number of Pods are running. Pod is a fundamental building block, defining individual container configurations. By keeping these resources separate, Kubernetes gives users the flexibility to choose the right abstraction for their specific needs.

Custom Workflows: There are cases where you might need to manage Pods and ReplicaSets directly without the overhead of a Deployment. For example, some advanced use cases involve manually controlling the scaling process or working with StatefulSets (which manage stateful workloads), where you may need more control over the underlying components.

Modularity and Reusability: Kubernetes is designed with modularity in mind. By having resources like Pod, ReplicaSet, and Deployment as separate entities, users can mix and match them to suit their application’s architecture. For instance, ReplicaSet can be used independently for managing stateless workloads, while Pod can be directly used for testing or special configurations.

What are stateful and stateless applications?

  • Stateful Application: An application that maintains state or data across sessions or requests. It remembers previous interactions, typically through databases or persistent storage. Examples include databases (e.g., MySQL), applications with user login sessions, and services that require session tracking.
  • Stateless Application (Deployment): An application that does not maintain state between sessions or requests. Each request is independent, and no data is stored between requests. Examples include web servers or APIs that process each request independently, like RESTful APIs.

How does Kubernetes handle Pod failures?

Pod Restart Policy: Kubernetes has a Pod restart policy that controls the behavior of Pods when they fail. The restart policy is defined within the Pod specification.

Replication Controller / ReplicaSet: If you are using Deployments, ReplicaSets, or ReplicationControllers, Kubernetes ensures that the desired number of replicas of a Pod are always running. If a Pod fails or gets terminated (either due to node failure or other reasons), the controller automatically creates a new Pod to maintain the specified number of replicas.

Health Checks: Liveness and Readiness Probes:

  • Liveness Probe: Kubernetes uses this probe to check if a Pod is still running and healthy. If the liveness probe fails (i.e., the Pod is considered “unhealthy”), Kubernetes will restart the Pod automatically.
  • Readiness Probe: This probe checks if a Pod is ready to serve traffic. If the readiness probe fails, Kubernetes will stop routing traffic to that Pod until it is ready again. The Pod itself is not restarted.
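A minimal sketch of both probes in a container spec (the paths, port, and timings are placeholders):

    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 3
      periodSeconds: 5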

What is service mesh ?

A service mesh is a dedicated infrastructure layer that manages communication between microservices. It provides features like service discovery, load balancing, traffic management, security (encryption, authentication, authorization), observability (logging, metrics, tracing), and retries.

CMD VS ENTRYPOINT in docker

CMD

It can be overridden by arguments provided in the docker run command. E.g. if we pass any extra command while running the container, it overrides CMD, and the Dockerfile's CMD will not execute.

docker run -dp 3000:3000 <image> ping "google.com"
#here the CMD command mentioned in the Dockerfile will not run; the ping command runs instead

ENTRYPOINT

Cannot be easily overridden, unless explicitly done with the --entrypoint flag. E.g. if we pass any extra command while running the container, it does not override ENTRYPOINT; the Dockerfile's ENTRYPOINT command still executes, with the extra arguments appended to it.

docker run -dp 3000:3000 <image> ping "google.com"
#here the ENTRYPOINT command mentioned in the Dockerfile still runs; "ping google.com" is appended to it as arguments
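A common pattern combines the two: ENTRYPOINT fixes the executable and CMD supplies default arguments that can be overridden (a sketch):

ENTRYPOINT ["ping", "-c", "4"]
CMD ["google.com"]
# docker run <image>               -> ping -c 4 google.com
# docker run <image> example.com   -> ping -c 4 example.com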

How does the kubelet work in k8s?

The kubelet is an agent running on each node in a Kubernetes cluster. It ensures that the containers in the pods are running and healthy, monitors their status, and manages their lifecycle based on the pod specifications provided by the Kubernetes API server. The kubelet also handles tasks like health checks, logging, and resource management for containers.

What are dangling images?

Dangling images are unused Docker images that no longer have any tag associated with them. These typically occur during the image build process when old layers are replaced by new ones. They are listed as <none>:<none> in the output of docker images.
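To list and remove them:

docker images -f "dangling=true"   # list dangling images
docker image prune                 # remove them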

What is the scratch image in Docker?

scratch is a special, empty base image provided by Docker. It contains no operating system files, libraries, or utilities; it's entirely blank. It is often used to create ultra-lightweight containers by adding only the application and its direct dependencies. We can use it when we want complete control over what is included in the image.
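A minimal sketch of a scratch-based image, assuming a statically linked Go binary (the Go version and paths are placeholders):

# Stage 1: build a static binary
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /app .

# Stage 2: empty base image containing only the binary
FROM scratch
COPY --from=build /app /app
ENTRYPOINT ["/app"]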

What is the difference between docker attach vs exec ?

Attach: Attaches your terminal to the main process of a running container. We can see the output of the container's main process, and if we exit it (e.g. Ctrl+C), the container stops; detach with Ctrl+P Ctrl+Q to leave it running.

EXEC: Executes a new command or process inside an already running container.
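For example:

docker attach <container_id>               # attach to the main process (PID 1)
docker exec -it <container_id> /bin/bash   # start a new shell inside the container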

Difference between -p and -P in docker ?

-p: Maps a container port to a specific host port.

-P: Maps all exposed ports in the container to random available ports on the host.

Kubernetes Cluster Upgrade [Master & Worker Nodes]

  • Backup the Kubernetes cluster
  • Upgrade the primary control plane node.
  • Upgrade additional control plane nodes.
  • Upgrade worker nodes.

Upgrade master node

  • Upgrade kubeadm on the control plane node
  • Drain the control plane node
  • Plan the upgrade (kubeadm upgrade plan)
  • Apply the upgrade (kubeadm upgrade apply)
  • Upgrade kubelet & kubectl on the control plane node
  • Uncordon the control plane node

Upgrade worker nodes

  • Drain the node (draining also cordons it)
  • Upgrade kubeadm on the node
  • Upgrade the kubelet configuration (kubeadm upgrade node)
  • Upgrade kubelet & kubectl
  • Uncordon the node
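The corresponding commands look roughly like this (node names and the version are placeholders; the package-upgrade step depends on your distro):

# On the control plane node
kubectl drain <cp-node> --ignore-daemonsets
sudo kubeadm upgrade plan
sudo kubeadm upgrade apply v1.30.x
# upgrade the kubelet and kubectl packages, then:
sudo systemctl restart kubelet
kubectl uncordon <cp-node>

# On each worker node
kubectl drain <worker-node> --ignore-daemonsets
sudo kubeadm upgrade node
# upgrade the kubelet and kubectl packages, then:
sudo systemctl restart kubelet
kubectl uncordon <worker-node>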

Affinity, Taint and Toleration

Node Selector: Schedules the pod onto a node carrying a particular label, e.g. arch: arm-64. For example, suppose our pod must run on a node with an arm-64 CPU or on a Windows machine; in that case we can use nodeSelector.

NodeAffinity: 2 types

  • preferredDuringSchedulingIgnoredDuringExecution: specifies a preference for scheduling on nodes that match the specified label, but it’s not a strict requirement.
  • requiredDuringSchedulingIgnoredDuringExecution: specifies that the pod can only be scheduled on nodes that match the specified label.

Taints and Toleration

Taint: Taints are applied to nodes and indicate that pods without a matching toleration should not be scheduled on those nodes (used when we don't want pods scheduled on a particular node).
Toleration: When a node is tainted and we still want to schedule a pod on it, we add a toleration with the same key-value pair and effect, so the pod tolerates the taint and can be scheduled.
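For example, tainting a node and then tolerating that taint in a pod spec (the key/value are placeholders):

kubectl taint nodes node1 key1=value1:NoSchedule

# in the pod spec:
tolerations:
- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoSchedule"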

What are the different networking types in Docker?

  • bridge: The default driver. Containers on the same user-defined bridge network can reach each other by name.
  • host: The container shares the host's network stack directly, so no port mapping is needed.
  • none: Networking is disabled for the container.
  • overlay: Connects multiple Docker daemons so containers can communicate across hosts (used by Swarm).
  • macvlan: Gives the container its own MAC address so it appears as a physical device on the network.

What are targets in Docker Compose?

In Docker Compose, targets allow you to specify which stage of a multi-stage build to use when building an image. This can be useful if you only want to build up to a certain stage rather than building all the way to the final one.
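A minimal sketch in docker-compose.yml, assuming the Dockerfile has a stage named builder:

services:
  app:
    build:
      context: .
      target: builder   # build only up to the "builder" stage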

Difference between docker engine vs docker container ?

  • Docker Engine: Docker Engine is the core software that runs and manages containers on a host machine. It provides the runtime environment for building, running, and managing containers.
  • Docker Container: A Docker Container is a lightweight, isolated, and executable instance of an application created from a Docker image. It runs on top of the Docker Engine.

Network Policy in k8s

By default all pods can communicate with each other, but we can add a network policy to restrict traffic, e.g. so that only the frontend pod can communicate with the backend and only the backend pod can communicate with the database pod.

A Network Policy in Kubernetes is a resource used to control traffic flow at the pod level. It defines ingress (incoming) and egress (outgoing) rules to specify which pods or IPs can communicate with each other. It improves security by restricting unauthorized access.

Network Policy for the 3 tier Architecture

In a three-tier application (typically consisting of frontend, backend, and database) deployed in Kubernetes, Network Policies are used to control the communication between these tiers. This ensures security by restricting unnecessary traffic and only allowing connections explicitly defined in the policy.

  1. Frontend Tier:
    • Handles user interactions (e.g., a web interface).
    • Typically exposed to the internet or other external clients via a service (LoadBalancer/Ingress).
  2. Backend Tier:
    • Handles business logic.
    • Communicates with both the frontend and the database.
  3. Database Tier:
    • Stores and retrieves application data.
    • Only accessible to the backend.

How Network Policies Work in Kubernetes

Network Policies are applied at the pod level, using labels to define which pods can communicate with one another. These policies define:

  1. Ingress Rules: Allow incoming traffic to the pod.
  2. Egress Rules: Allow outgoing traffic from the pod.

If no Network Policy exists, all pods can communicate with each other freely. However, when a policy is applied, it blocks all traffic except what is explicitly allowed.

Example: Network Policy for a Three-Tier Application

1. Labels for Pods

Each tier is labeled to distinguish the pods:

  • Frontend: tier: frontend
  • Backend: tier: backend
  • Database: tier: database

2. Network Policy Example

Frontend Tier Policy: Allow only external traffic (from the internet) to access the frontend pods:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-ingress
  namespace: app-namespace
spec:
  podSelector:
    matchLabels:
      tier: frontend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - ipBlock:
            cidr: 0.0.0.0/0  # Allow traffic from anywhere
      ports:
        - protocol: TCP
          port: 80          # Allow traffic on port 80 (HTTP)

Backend Tier Policy: Allow only the frontend tier to send traffic to the backend pods:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: app-namespace
spec:
  podSelector:
    matchLabels:
      tier: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              tier: frontend  # Only allow traffic from frontend pods
      ports:
        - protocol: TCP
          port: 8080         # Backend service port

Database Tier Policy: Allow only the backend tier to connect to the database pods:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-backend-to-database
  namespace: app-namespace
spec:
  podSelector:
    matchLabels:
      tier: database
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              tier: backend  # Only allow traffic from backend pods
      ports:
        - protocol: TCP
          port: 3306         # Database service port (e.g., MySQL)

3. Traffic Flow

  • Frontend:
    • Receives external traffic (e.g., user requests).
    • Can send traffic only to the backend.
  • Backend:
    • Accepts traffic only from the frontend.
    • Sends traffic to the database.
  • Database:
    • Accepts traffic only from the backend.
    • Does not send traffic anywhere.

Persistent Volume and Persistent Volume claim

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: test
  hostPath:
    path: /tmp/test
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: test

The PVC binds to the PersistentVolume by matching storageClassName (along with capacity and access modes); both must specify the same class.
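A minimal sketch of a Pod that mounts this claim (the image and mount path are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: pv-pod
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /usr/share/nginx/html
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: my-pvc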

What is a headless service?

A headless service in Kubernetes is a type of service that doesn't have a cluster IP and doesn't load-balance traffic. Instead, it allows clients to directly access the IP addresses of the individual pods it manages, making it ideal for stateful applications needing stable network identities. It is mostly used for databases.
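It is declared by setting clusterIP: None (a sketch; the selector and port are placeholders):

apiVersion: v1
kind: Service
metadata:
  name: my-db-headless
spec:
  clusterIP: None
  selector:
    app: my-db
  ports:
  - port: 3306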

What is Operator in k8s ?

A Kubernetes Operator automates the management of an application, including deployment, scaling, upgrades, configuration, and backup tasks. It uses Kubernetes APIs and custom resources to handle workloads that require operational knowledge.

Equality-based vs set-based selectors

  • Equality-Based Selector: Matches labels based on exact equality or inequality (=, ==, or !=).
  • It is simpler and directly matches labels key-value pairs.
  • Equality-based selectors are used when you want to match labels exactly. Most common for Services, ReplicationControllers, and Deployments.
apiVersion: v1
kind: Pod
metadata:
  name: frontend-pod
  labels:
    app: frontend
---
apiVersion: v1
kind: Service
metadata:
  name: frontend-service
spec:
  selector:
    app: frontend   # Equality-based selector
  ports:
    - protocol: TCP
      port: 80
  • Set-Based Selector: Matches labels based on a set of values (e.g., in, notin, exists).
  • It is more expressive and powerful, allowing matching on multiple conditions.
  • Commonly used with selectors in Deployments, StatefulSets, and Node Affinity rules. Note that Services only support equality-based selectors; set-based selectors belong to workload resources such as Deployments.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: multi-app
spec:
  replicas: 2
  selector:
    matchExpressions:
      - key: app
        operator: In
        values: [frontend, backend]
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
        - name: web
          image: nginx

Load balancer vs external IP

  • Load Balancer: A Load Balancer is a service type in Kubernetes (Service Type: LoadBalancer) that exposes your application externally using a cloud provider’s load balancer.
  • The Kubernetes API creates an external load balancer in the cloud provider (e.g., AWS ELB, Azure Load Balancer, GCP Load Balancer).
  • It assigns an external IP to the load balancer.
  • The load balancer routes external traffic to the Kubernetes service, which then forwards the traffic to the appropriate Pods.
  • External IP: The External IP acts as a public-facing IP for accessing the service.
  • It requires manual assignment of an IP (unlike a load balancer, which is provisioned automatically).
apiVersion: v1
kind: Service
metadata:
  name: my-externalip-service
spec:
  selector:
    app: my-app
  type: ClusterIP
  externalIPs:
    - 203.0.113.25
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
  • The external IP (203.0.113.25) is specified manually, and traffic sent to this IP is routed to the service, which forwards it to the Pods.
  • Common in on-premises environments where no cloud load balancer exists.

What are Kubernetes labels and selectors?

  • Kubernetes Labels: Labels are key-value pairs attached to Kubernetes objects, such as pods, services, nodes, and deployments. They are used to organize and identify resources based on meaningful attributes.
  • Kubernetes Selectors: Selectors are used to query Kubernetes objects based on their labels. They help match resources with specific attributes.

Use Cases

  • Pods and Services: Services use selectors to target pods with specific labels, enabling load balancing.
  • Deployments and ReplicaSets: Deployments use selectors to manage the pods they create.
  • Node Affinity: Match nodes with specific labels to schedule pods appropriately.

Which CLI command is used to switch between Kubernetes clusters?

To switch between Kubernetes clusters using the CLI, you can use the kubectl config use-context command. Here’s how it works:

Switch to a Specific Context:

kubectl config use-context <staging-cluster>

Why is the Metrics Server required for horizontal pod autoscaling?

The Horizontal Pod Autoscaler uses metrics (like CPU and memory utilization) to scale the number of pods up or down based on the observed load. The Metrics Server is responsible for collecting these metrics and exposing them via the Kubernetes API. Without Metrics Server HPA will not work.

  • Metrics Collection: The Metrics Server collects resource usage data (CPU, memory) from nodes and pods in the cluster.
  • API Exposure: It then exposes this data via the Kubernetes API, allowing HPA to access it and make scaling decisions.
  • Scaling Decision: The HPA uses this data (such as CPU utilization) to determine whether to scale the application up or down.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
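The Metrics Server itself is typically installed from the upstream release manifest and verified like this:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
kubectl top nodes   # verify that metrics are being collected
kubectl top pods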

I want to add a custom header using the NGINX Ingress Controller in k8s; how can I do this?

To add custom headers using the NGINX Ingress Controller in Kubernetes, you can use annotations in the Ingress resource. Specifically, the nginx.ingress.kubernetes.io/configuration-snippet annotation allows you to insert custom configuration snippets into the NGINX configuration for a particular Ingress.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
  namespace: default
  annotations:
    nginx.ingress.kubernetes.io/configuration-snippet: |
      add_header X-Custom-Header "MyCustomValue";
      add_header X-Another-Header "AnotherValue";
spec:
  rules:
  - host: example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: example-service
            port:
              number: 80

Can you copy a file from local to a running container?

Yes, you can copy a file from your local machine to a running Docker container using the docker cp command.

docker cp <local_file_path> <container_id>:<container_path>

I want to set resource quotas for a namespace and a pod; how do I do that?

Namespace:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: my-namespace
spec:
  hard:
    requests.cpu: "1"
    requests.memory: 1Gi
    limits.cpu: "2"
    limits.memory: 2Gi
Pod:
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  namespace: my-namespace
spec:
  containers:
  - name: my-container
    image: my-image
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"

Why do we need a CNI like Flannel, Calico, etc.?

  • Pod Networking: CNIs enable communication between pods across nodes in a Kubernetes cluster, ensuring that each pod can reach others regardless of where they are deployed. Suppose one pod is deployed on one node and another pod on a different node: the CNI assigns each a unique IP and handles the connectivity so the two can talk to each other.
  • Network Isolation: CNIs enforce isolation between different namespaces and pods, providing security and preventing unauthorized access or data breaches.
  • Scalability: They handle large-scale deployments by efficiently managing network traffic, ensuring that the cluster can scale up or down without network bottlenecks.

4 best practices in Docker

  1. Always create the Dockerfile in an empty (dedicated) directory, so the build context sent to the Docker daemon stays small.
  2. Always use official images whenever possible.
  3. Use a specific tag instead of 'latest'.
  4. Use multi-stage builds to reduce the image size.

Difference between docker stop and kill ?

  • docker stop: Best used when you want to gracefully stop a container. It sends SIGTERM and, after a grace period (10 seconds by default), SIGKILL, giving the process a chance to shut down cleanly.
  • docker kill: Forceful termination. Sends a SIGKILL signal immediately, terminating the process without giving it a chance to perform any cleanup.

RDS vs DynamoDB

Amazon RDS is a managed SQL database service that supports multiple engines like MySQL and PostgreSQL, ideal for applications needing complex queries and transactions. DynamoDB is a managed NoSQL database service designed for high performance and scalability, perfect for applications requiring flexible data structures and handling massive workloads. In essence, choose RDS for structured data and relational needs, and DynamoDB for unstructured data and high scalability.

Should I use a StatefulSet for a web application? If not, why do we mostly use Deployments for web applications?

  • Stateless Nature of Web Apps: Most web applications are stateless, meaning they do not store session data or state in the application server itself. Instead, state is managed externally (e.g., in databases, caches like Redis, or session stores).
  • Ease of Scaling: Deployments allow easy horizontal scaling by simply increasing the replica count. New pods can be added without needing stable identities or persistent storage.
  • Rolling Updates: Deployments provide rolling updates out of the box, ensuring zero downtime while updating the application.
  • Simpler Management: Deployments are simpler to manage and work well with stateless services like REST APIs or web servers (e.g., Nginx, Apache, Node.js).
  • Better Load Balancing: Load balancers like Kubernetes Services (e.g., ClusterIP, NodePort) distribute traffic to any available pod. With stateless apps managed by Deployments, all pods are interchangeable, improving efficiency.

Difference between StatefulSets and Deployments?

The main difference is pod identity: a Deployment creates pods with random hashed names (e.g. nginx-7d9c5b5df4-x2k8p), while a StatefulSet creates pods with stable ordinal names (nginx-0, nginx-1, and so on); if one is terminated, it is recreated with the same name. StatefulSets also provide per-pod persistent storage and ordered startup and shutdown.
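A minimal StatefulSet sketch (the headless service it references must exist separately):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nginx
spec:
  serviceName: nginx-headless   # headless service governing the pods' network identity
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2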

How to add a new worker node to your existing Kubernetes cluster with 1 master and 2 worker nodes ?

  • Ensure the new node has the required software: Docker (or another container runtime), kubeadm, kubectl, and kubelet.
  • Check that the node can communicate with the master and other worker nodes on the cluster network.
  • On the master node, run the following command to generate the token and join command:
    sudo kubeadm token create --print-join-command
  • Run the join command on the new worker node.

Can we give the Dockerfile a different name?

Yes, you can use a custom name for your Dockerfile, but you need to specify the file name explicitly when building the Docker image using the -f option in the docker build command.

docker build -f MyDockerfile -t my-app:v1.0 .

Share Image via Docker Save and Load

If you want to share Docker images without using a registry, you can export the image to a tarball file and share it directly.

docker save -o <image_name>.tar <image_name_or_id>

How do we do a rolling update without editing the manifest file?

kubectl set image deployment/my-app my-app=my-app:2.0

Difference between docker create and run?

  • docker create: prepares the container but does not start it, while docker run creates and starts the container in one step. Use it when you want a container from an image without starting it; you can start it separately later with docker start.
  • docker run: is the more common command used for most operations, while docker create is used for scenarios where you need to prepare a container but not run it immediately.

How to share a Docker image without any registry?

The docker save command is used to save a Docker image into a tarball (compressed archive file), which can then be transferred or stored for later use. This is useful when you want to backup an image or transfer it between systems without using a Docker registry (like Docker Hub).

docker save -o <output_file.tar> <image_name>

The docker load command is used to load a saved image (tarball) back into Docker from the .tar file. This allows you to import a Docker image that was saved with the docker save command into your local Docker repository.

You can use docker save to save an image to a tarball and then use docker load on another machine to import the image. This is helpful when the machines do not have direct access to Docker Hub or a private registry.

How to roll back in k8s?

# View the revision history
kubectl rollout history deployment/my-deployment

# Rollback to revision 3
kubectl rollout undo deployment/my-deployment --to-revision=3

Use the same app with different environments

Create different values files, e.g. dev-values.yaml, stage-values.yaml, and prod-values.yaml.
Create a namespace each for dev, stage, and prod.
Install the app per environment:

helm install <app-name> . -f dev-values.yaml -n dev

Service selects the pods created by the deployment

  • Deployment Template Labels: The deployment specifies that all pods it creates will have the label app: nginx.
template:
  metadata:
    labels:
      app: nginx
  • Service Selector: The service uses a selector to identify and route traffic to the pods with the matching label.
selector:
  app: nginx

The label must be identical in both places; only then does the service route traffic to the pods.

MetalLB vs Ingress

MetalLB is a network load balancer for bare-metal Kubernetes clusters, providing external IPs for services of type LoadBalancer. It operates at Layer 2 (ARP) or Layer 3 (BGP) to distribute traffic to backend pods. It is essential for on-prem environments where cloud-based load balancers are unavailable (in k8s we mostly use it for databases).

Ingress, on the other hand, is a Layer 7 (HTTP/S) traffic router that directs incoming requests to different services based on hostnames, paths, or TLS rules. It requires an Ingress Controller like NGINX or Traefik and is best suited for web applications needing domain-based routing, SSL termination, and centralized access management.

While MetalLB provides an external IP, Ingress manages HTTP/S routing. Both can be used together—MetalLB assigns an external IP to an Ingress Controller, which then handles traffic routing inside the cluster.
