Persistent Data & Resource Management

Persistent Data

In previous articles, we have seen that if a POD is decommissioned or the container is restarted, all data on the POD file system is erased.

It is a winning approach for all those stateless applications (for example a web front-end ) but it is not for stateful applications where, for example, not registering a record of a DataBase means losing vital information for the service provided.

Kubernetes brilliantly overcomes the obstacle through the use of Persistent Data technology.

It is the yaml file that defines the Persisten Data in the POD through the entries:

  • Volumes describing the volumes available for the POD.
  • VolumeMounts describing the path or usage of the volume (e.g. / mydata /)

Volumes are categorized into three main categories created based on their use:

1- Communication / synchronization

It is the shared volume to synchronize with the images of a remote Git.

The life of the volume is limited to the existence of the POD and the volume can be shared between multiple containers.

2- Persistent Data

To ensure high reliability and best performance, the PODs must be able to move freely between the nodes of the kubernetes cluster.

As a result, volumes that contain persistent and vital application information must always be reachable by the POD.

Kubernetes to ensure visibility supports many types of volumes such as NFS, iSCSI , Amazon’s Elastic Block Store, Azure File and Disk Storage, as well as Google Persistent Disk.

Note1: If you move your POD , Kubernetes can automatically unmount the volume from the old host and make it available on the new one.

3- Host filesystem

Some applications not only need a persistent volume, but also a file system available at the host level. The need is addressed through the hostPath volume (e.g. /var/mygp/).

Resource Management

The cost of operating a machine in a data center is independent of the amount of CPU & RAM that the single VM in operation uses.
On the other hand, ensuring that CPU & RAM resources are distributed in the best possible way within the infrastructure impacts the efficiency of the environment.


Let’s imagine two services. The former uses 20% of the memory of a VM configured with 5GB of RAM, the latter uses 50% of a second VM configured with 4GB RAM.

The total use of RAM memory is 1 + 2 = 3GB of the total 9GB allocated.

Utilization metric ( MU ) is defined as the percentage value between the ratio of the amount of actively used resources and the number of purchased resources.

In our example MU = 3/9 = 33%

In order to control resource usage, Kubernetes allows users to specify two different metrics at the POD level.

  • Resource Request specifies the minimum amount that can be assigned to the resource.
  • Resource limits specify the maximum amount that can be assigned to the application.

The example of Figure 1 shows an example of a resource limit

Figure 1

Kubernets: Health-Check & Port-Forwarding


One of the advanced and most important features of kubernetes is the health-check which allows you to check the integrity of the services.

It is an additional layer that adds to the standard controls that guarantee that the application processes are always running ( livenessProbe ), the application integrity controls ( ReadinessProbe )

The advantage of the health check is that:

  • It is specific for each container.
  • It uses the same logic as the application (for example, loading a web page, pinging a DB ).
  • The livenessProbe determines if the application is running correctly. Otherwise, the application is restarted.
  • The ReadinessProbe describes when the container is ready to satisfy the requests of the users (and therefore of the service)

The configuration of the health -check is done by adding the livenessProbe and ReadinessProbe items to the POD configuration yaml file (see figure 1)

Figure 1


Port forwarding authorizes the service (configured at the POD level) to communicate both with other PODs and with the outside world. Without Port Forwarding, the service is totally isolated.

The simplest example is that of a website. As long as port-forwarding is not started, the pages of the site are not available to users.

In our example ( some-mysql ), after POD has started, the command to enable application port-forwarding on port 8000 is:

kubectl port forward some-mysql 8000: 8000

An article is available on the website that explores the networking issue of Container environments. To read it click here.

In a future article, we will talk about load balancers that help networking management.



Kubernetes: Pods

In previous articles, we have seen that containers are ” abstractplaces where applications run in the form of images.

PODs are the aggregation of multiple containers .

A service is the aggregation of several PODs .

The image in figure 1 shows the concept just explained.

Figure 1


All applications (images) present within the same POD will have the same (shared) IP address and the same Hostname (UTS NameSpace).

Communication between containers within the same HOST occurs through POSIX or System V IPCs

Now imagine you want to provide the ” the-gable-svc ” service, built with two images (containers): a Database and a Front-End.

During the design phase, is it better to design a single POD that contains the two containers ( figure 2 ) or two PODs with a container each ( figure 3 )? ( 1POD x 2 CONTAINER or 2 POD x 1 CONTAINER )

Figure 2

Figure 3

To answer accurately, you need to understand which application needs the most scalability and flexibility.

In our example it is the DB that could require more resources (RAM & CPU) to manage access peaks.

If I had designed a single POD, the increase in resources would involve both applications, effectively not optimizing energy expenditure.

Note 1: CPU & RAM resources are allocated during POD creation.

Is it always better to create more PODs?

This latter statement does not align with K8s’ resilience policy, where PODs should run on different physical hosts ( k8s is a cluster).

To resolve the demise, there is a good rule :

If the service works fine even though the PODs are spread across multiple hosts, then it is better to use multiple PODs (Figure 2).

Figure 4 shows the contents of the mysql-pod.yaml file that creates the POD for the mySQL application.

Figure 4

Let’s see the basic POD management syntax:

  • To start it just run the command: kubectl apply -f mysql-pod.yaml
  • To check its status: kubectl get pods
  • To get all the details: kubectl describe pods some-mysql
  • If we wanted to delete it: kubectl delete -f mysql-pod.yaml

For today it’s all in a little bit where we will talk about Access to the POD, how to copy files and much more.


Kubernetes: The components

In previous articles we have seen some details of how the Kubernetes architecture is built.

Today the working mechanisms of the Kubernetes engine will be described indicating the name of each component; to remain faithful to the comparison of the car engine, we will speak of the camshafts, valves, bearings, … that belong to the Cloud Native

Note 1: The installation of k8s in Datacenter, Cloud, and Laboratory will not be discussed, the network has already made comprehensive tutorials available.

To familiarize yourself with k8s I recommend using Minikube (Linux platform) Docker Desktop (Windows & Mac platform).

Let’s begin!

Kubernetes Master:It is the main node of the cluster on which three processes that are vital for the existence of the cluster run.

  • kube-apiserver
  • kube-controller-manager
  • Kube-scheduler

In the master node, there is also the DataBase etcd, which stores all configurations created in the cluster.

The nodes that take care of running the applications and therefore the services are said worker node. The processes present on the worker node I’m:

  • Kubelet
  • kube-proxy

kubectl : AND’ The official Kubernetes client ( CLI ) through which you can manage the cluster ( Kube-apiserver ) using the API.

Some simple examples of kubectl commands are:

  • kubectl version (indicates the version of k8s installed)
  • kubectl get nodes (find out the number of nodes in the cluster)
  • kubectl describe nodes nodes-1 (shows the health status of the node, the platform on which k8s is running (Google, AWS, ….) and the allocated resources (CPU, RAM)).

Kube-Proxy : He is responsible for managing networking, from Routing to Load Balancing rules.

Note 2 : K8s will try to use them all libraries available at the level of operating system .

Container Runtime : It is the foundation on which the k8s technology rests.

kubernetes supports several runtimes among which we remember, container-d , cri-o , rktlet .

Note 3 : The runtime Docker it has been deprecated in favor of those that use interfaces CRI ; Docker images will still continue to work in the cluster.

The objects Kubernetes base are:

  • Pod
  • Services
  • Volumes
  • Namespace

THE controller provide additional functionality and are:

  • ReplicaSet
  • Deployment
  • StatefulSet
  • DaemonSet
  • Job

Between Deployment it is imperative to mention Kube-DNS which provides name resolution services. Since kubernetes version 1.2 the name has changed to Core-dns.

Add-On : they are used to configure further cluster functions and are placed inside the name space kube-system (such as Kube-Proxy, Kube-DNS, kube-Dashboard)

Add-ons are categorized according to their use:

  • Add-on of Netwok policy . (For example the NSX-T add-on takes care of the communication between the K8s environment and VMware)
  • Add-on Infrastructural (For example KubeVirt which allows connection with virtual architectures)
  • Add-on of Visualization and Control (For example Dashboard a web interface for K8s).

For commissioning, Add-ons use controllers DaemonSet And Deployment .

The image in figure 1 summarizes what has just been explained.

Figure 1

Take care and see you soon.

Kubernets: Know the details

A good way to describe cloud-native environments is to refer to the image of your car.

The container is the engine, k8s is the electronic control unit that manages the proper functioning of the vehicle, the drivers, indicating the route and the destination, select the type of service to be provided.

Today’s article will reveal some architectural details to understand how “the car” manages to reach its destination in an efficient way.

Containers are of two types:

The first is called System Container. It is the bodywork of the car (I mean from the plates to seats, steering wheel, gear lever and accessories).

Often for simplicity of creation, it is a Virtual Machine (VM) with Linux operating system (it can also be Windows).

The most common services present in the VM are ssh , cron and syslog , the File System is of type ext3, ext4, etc.

The second type is called Application Container and is the place where the image will carry out the activities.

Note1: The image is not a single large file. They are usually multiple files which, through an internal cross-pointing system, allow the application to operate correctly.

The Container application (from now on container only), has an operating mode based on a rigid logic, where all levels (layers) have the peculiarity of communicating with each other and are interdependent.

Figure 1

This approach is very useful as it is able to manage the changes that may occur over time in an effective and hierarchical way.

Let’s take an example: When a service configuration change occurs, for which Layer C is updated, Layer A and B are not affected, which means that they must NOT be modified in turn.

Since Developers like to refine their own images (program files) rather than dependencies, it makes sense to set the service logic in the mode indicated in figure 2 where the dependencies are not affected by a new image.

Figure 2

Note2 : The File system on which the images are placed (in the example of the car engine we are talking about pistons, connecting rods, shafts …) is mainly of three different types:

  • Overlay
  • Overlay 2
  • AUFS

Note3 : A good advice on the security side is not to build the architecture so that the passwords are contained in the images ( Baked in – Cooked)

One of the splendid innovations introduced in the container world is the management of images:

In a classic high-reliability environment, the application is installed on every single node of the cluster.

In containers, the application is downloaded and deployed only when the workload requires more resources than a new cluster node with a new image.

For this reason, the images are saved in “virtual” warehouses, which can be local or distributed on the internet. They are called “Register Server”.

The most famous are Docker Hub, Google Container Registry, Amazon Elastic Container Registry, Azure Container Registry.

We conclude this article by talking about the management of resources associated with a service.

The container platform uses two features called Cgroup and NameSpace to allocate resources that work at the kernel level.

The purpose of the Cgroup is to assign the correct resources ( CPU & RAM ).

Name spaces have the purpose of grouping the different processes and making sure that they are isolated from each other ( Multitenancy ).

The type of NameSpace can affect all the components of the service as indicated in the list below.

  • Cgroup
  • PID
  • Users
  • Mount
  • Network
  • IPC (Interprocess communication)
  • UTS (allows a single system to appear with different host and domain names and with different processes, useful in case of migration)

An example of limiting the resources of an application is shown in Figure 3 where thegable image, downloaded from the Register Server grcgp, has a limit of RAM and CPU resources allocated.

Figure 3