Kubernetes Day 1
Enough cribbing about tooling, let’s run some containers!
Archetype
At $DAYJOB I maintain a lot of Python code, so to keep everything transferable I’m going to create a Python package called archetype,
which is your archetypical Azure Python REST service.
The only design constraint is that archetype
must not do anything interesting!
WYSIWYG, so to speak.
Its source code can be found here, and for my sanity I’ll document a little of its structure:
- scripts holds the project scripts according to the scripts-to-rule-them-all pattern.
- pyproject.toml defines metadata for the project. This file is involved when packaging the project into a wheel (a Python package), and if I were publishing on PyPI then this file would drive the metadata that others see about the package.
- src holds the source code of the project. By default, the build backend Hatchling will use these files for wheels and these files for source distributions. This might be very important to understand if I were publishing on PyPI, but for deploying services it seems not too important.
- requirements.txt and dev-requirements.txt are generated by pip-compile using pyproject.toml and are not meant to be edited by hand. They are checked in and regenerated whenever the dependency set of the project changes.
- Dockerfile simply copies the code, runs the setup script to install dependencies and install the package as an editable install, and then runs the project using uvicorn as the server.
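As a side note, the usual pip-tools workflow for regenerating those pinned files looks something like this (I’m assuming the dev dependencies live under an optional dev extra; the project’s actual scripts may invoke pip-compile differently):

# Pin runtime dependencies from pyproject.toml (assumed invocation).
pip-compile pyproject.toml -o requirements.txt
# Pin dev dependencies, assuming they are declared under an optional "dev" extra.
pip-compile --extra dev pyproject.toml -o dev-requirements.txt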
This is the output I see when running the project:
❯ ./scripts/server
INFO: Will watch for changes in these directories: ['/Users/gustavo/code/ghidalgo3.github.io/k8s']
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: Started reloader process [39552] using StatReload
Importing archetype module!
INFO: Started server process [39554]
INFO: Waiting for application startup.
INFO: Application startup complete.
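The scripts/server script itself isn’t reproduced here; judging from the StatReload output above, it presumably boils down to something like this sketch (the real script may differ):

#!/bin/sh
# Hypothetical sketch of scripts/server: run the app locally with auto-reload.
exec uvicorn archetype.main:app --reload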
Dockerizing Archetype
Dockerizing this project is pretty straightforward:
FROM python:3.11-bookworm
WORKDIR /app
COPY . /app
RUN scripts/setup
ENTRYPOINT [ "uvicorn", "archetype.main:app", "--host=0.0.0.0"]
Line by line:
- Start with an image that has Python pre-installed, python:3.11-bookworm.
- Create a working directory, /app.
- Copy everything under the Dockerfile’s path to /app.
- Run the same scripts/setup script inside the container. This installs our dependencies!
- Declare an entry point so uvicorn runs the server.
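Before pushing anywhere, the image can be smoke-tested locally (the local tag name is arbitrary; port 8000 is the uvicorn default, matching the containerPort used later):

# Build the image and run it locally, mapping the container port to the host.
docker build -t archetype .
docker run --rm -p 8000:8000 archetype

curl http://localhost:8000 should then return the app’s response.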
Then I publish this container to a private ACR with docker push "gustavo2acr.azurecr.io/archetype:latest" (the tag is set earlier with docker build -t, since docker push only takes the image name).
Pretty simple!
ARM?
Eventually I’ll come to find out that container images are built for specific ISAs!
When I build images on my Mac, the image is an ARM build, not an x86 build.
If I wanted to run the image as-is on the Kubernetes nodes I created, I would have needed a cluster with ARM CPU VMs.
The solution is simply to ask Docker to build an x86 image by passing --platform=linux/amd64 to the docker build call.
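Putting that together, the cross-platform build and push look roughly like this (the latest-amd64 tag matches what the deployment manifest below references; the registry name is inferred from the image name above):

# Log in to the private registry; the Azure CLI wires up Docker credentials.
az acr login --name gustavo2acr
# Build an x86 image on the ARM Mac and tag it for ACR.
docker build --platform=linux/amd64 -t gustavo2acr.azurecr.io/archetype:latest-amd64 .
# Push it so the AKS nodes can pull it.
docker push gustavo2acr.azurecr.io/archetype:latest-amd64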
Deploying to AKS
I’ll do this in the most verbose way possible first to motivate using Helm.
Verbose mandates tool purity, so kubectl is my only choice.
To authenticate against the cluster, I simply run:
az aks get-credentials \
--resource-group $(terraform output -raw resource_group_name) \
--name $(terraform output -raw k8s_cluster_name)
Double checking that worked…
❯ kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 192.168.0.1 <none> 443/TCP 20h
Cool, I have a configured kubectl.
Building this from the bottom up, my understanding is that I need 3 Kubernetes resources:
- A Deployment object
- A Service object
- An Ingress object
Deployment
According to the docs, a Deployment resource declaratively controls Pods and ReplicaSets. My read is that since 90% of all Kubernetes applications just need the cluster to make sure that N replicas of the application are running, Deployments were created to satisfy this common use case.
Here’s our deployment definition:
apiVersion: apps/v1
kind: Deployment
metadata:
name: archetype-deployment
labels:
app: archetype
spec:
replicas: 2
selector:
matchLabels:
app: archetype
template:
metadata:
labels:
app: archetype
spec:
containers:
- name: archetype
image: gustavo2acr.azurecr.io/archetype:latest-amd64
ports:
- containerPort: 8000
resources:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "128Mi"
cpu: "500m"
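Applying it is a single command (assuming the manifest above is saved as deployment.yaml; the file name is my choice and nothing special):

# Create (or update) the Deployment from the manifest.
kubectl apply -f deployment.yaml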
Then, describing the result:
> kubectl describe deployment
Name: archetype-deployment
Namespace: default
CreationTimestamp: Sun, 25 Feb 2024 12:35:33 -0500
Labels: app=archetype
Annotations: deployment.kubernetes.io/revision: 1
Selector: app=archetype
Replicas: 2 desired | 2 updated | 2 total | 0 available | 2 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=archetype
Containers:
archetype:
Image: gustavo2acr.azurecr.io/archetype:latest-amd64
Port: 8000/TCP
Host Port: 0/TCP
Limits:
cpu: 500m
memory: 128Mi
Requests:
cpu: 250m
memory: 64Mi
Environment: <none>
Mounts: <none>
Volumes: <none>
Conditions:
Type Status Reason
---- ------ ------
Available False MinimumReplicasUnavailable
Progressing True ReplicaSetUpdated
OldReplicaSets: <none>
NewReplicaSet: archetype-deployment-75cc984b68 (2/2 replicas created)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 3m23s deployment-controller Scaled up replica set archetype-deployment-75cc984b68 to 2
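Note the 0 available | 2 unavailable in the replica counts. When that happens, describing the pods usually points at the culprit; here I’d expect a FailedScheduling event complaining about insufficient CPU, which is exactly what the next section digs into:

# Scheduling failures show up under each pod's Events.
kubectl describe pods -l app=archetype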
Resource Limits
I’m cheap, so my cluster only has 1 node and it’s a little 2-core Azure VM. It’s remarkable how many pods are running by default in a cluster doing nothing:
❯ kubectl describe node
Name: aks-default-42366886-vmss000000
Roles: agent
Labels: agentpool=default
beta.kubernetes.io/arch=amd64
beta.kubernetes.io/instance-type=Standard_DS2_v2
beta.kubernetes.io/os=linux
failure-domain.beta.kubernetes.io/region=eastus2
failure-domain.beta.kubernetes.io/zone=0
kubernetes.azure.com/agentpool=default
kubernetes.azure.com/cluster=MC_gustavo2-rg_gustavo2-aks-cluster_eastus2
kubernetes.azure.com/consolidated-additional-properties=91882a8a-d34a-11ee-bd45-2a30a704d842
kubernetes.azure.com/kubelet-identity-client-id=abb742dc-7a98-47b0-8d03-cc3d662c02c6
kubernetes.azure.com/mode=system
kubernetes.azure.com/node-image-version=AKSUbuntu-2204gen2containerd-202402.07.0
kubernetes.azure.com/nodepool-type=VirtualMachineScaleSets
kubernetes.azure.com/os-sku=Ubuntu
kubernetes.azure.com/role=agent
kubernetes.azure.com/storageprofile=managed
kubernetes.azure.com/storagetier=Premium_LRS
kubernetes.io/arch=amd64
kubernetes.io/hostname=aks-default-42366886-vmss000000
kubernetes.io/os=linux
kubernetes.io/role=agent
node-role.kubernetes.io/agent=
node.kubernetes.io/instance-type=Standard_DS2_v2
storageprofile=managed
storagetier=Premium_LRS
topology.disk.csi.azure.com/zone=
topology.kubernetes.io/region=eastus2
topology.kubernetes.io/zone=0
Annotations: csi.volume.kubernetes.io/nodeid:
{"disk.csi.azure.com":"aks-default-42366886-vmss000000","file.csi.azure.com":"aks-default-42366886-vmss000000"}
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Sat, 24 Feb 2024 14:28:09 -0500
Taints: <none>
Unschedulable: false
Lease:
HolderIdentity: aks-default-42366886-vmss000000
AcquireTime: <unset>
RenewTime: Sun, 25 Feb 2024 13:06:03 -0500
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
ContainerRuntimeProblem False Sun, 25 Feb 2024 13:02:16 -0500 Sat, 24 Feb 2024 14:29:40 -0500 ContainerRuntimeIsUp container runtime service is up
VMEventScheduled False Sun, 25 Feb 2024 13:02:16 -0500 Sat, 24 Feb 2024 14:29:40 -0500 NoVMEventScheduled VM has no scheduled event
FrequentDockerRestart False Sun, 25 Feb 2024 13:02:16 -0500 Sat, 24 Feb 2024 14:29:40 -0500 NoFrequentDockerRestart docker is functioning properly
ReadonlyFilesystem False Sun, 25 Feb 2024 13:02:16 -0500 Sat, 24 Feb 2024 14:29:40 -0500 FilesystemIsNotReadOnly Filesystem is not read-only
FrequentKubeletRestart False Sun, 25 Feb 2024 13:02:16 -0500 Sat, 24 Feb 2024 14:29:40 -0500 NoFrequentKubeletRestart kubelet is functioning properly
KubeletProblem False Sun, 25 Feb 2024 13:02:16 -0500 Sat, 24 Feb 2024 14:29:40 -0500 KubeletIsUp kubelet service is up
FilesystemCorruptionProblem False Sun, 25 Feb 2024 13:02:16 -0500 Sat, 24 Feb 2024 14:29:40 -0500 FilesystemIsOK Filesystem is healthy
KernelDeadlock False Sun, 25 Feb 2024 13:02:16 -0500 Sat, 24 Feb 2024 14:29:40 -0500 KernelHasNoDeadlock kernel has no deadlock
FrequentUnregisterNetDevice False Sun, 25 Feb 2024 13:02:16 -0500 Sat, 24 Feb 2024 14:29:40 -0500 NoFrequentUnregisterNetDevice node is functioning properly
FrequentContainerdRestart False Sun, 25 Feb 2024 13:02:16 -0500 Sat, 24 Feb 2024 14:29:40 -0500 NoFrequentContainerdRestart containerd is functioning properly
MemoryPressure False Sun, 25 Feb 2024 13:04:21 -0500 Sat, 24 Feb 2024 14:28:09 -0500 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Sun, 25 Feb 2024 13:04:21 -0500 Sat, 24 Feb 2024 14:28:09 -0500 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Sun, 25 Feb 2024 13:04:21 -0500 Sat, 24 Feb 2024 14:28:09 -0500 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Sun, 25 Feb 2024 13:04:21 -0500 Sat, 24 Feb 2024 14:28:10 -0500 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: 10.0.1.4
Hostname: aks-default-42366886-vmss000000
Capacity:
cpu: 2
ephemeral-storage: 129886128Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 7097684Ki
pods: 30
Allocatable:
cpu: 1900m
ephemeral-storage: 119703055367
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 4652372Ki
pods: 30
System Info:
Machine ID: 91d02b7297d74d2290142148194c2bec
System UUID: e0c121d3-1932-4eb1-b50f-fbec6960eccd
Boot ID: 36431874-31af-4b99-84d5-11c14da820a8
Kernel Version: 5.15.0-1054-azure
OS Image: Ubuntu 22.04.3 LTS
Operating System: linux
Architecture: amd64
Container Runtime Version: containerd://1.7.7-1
Kubelet Version: v1.27.9
Kube-Proxy Version: v1.27.9
ProviderID: azure:///subscriptions/394dccd4-6f06-4003-8c9a-671ba0d665c7/resourceGroups/mc_gustavo2-rg_gustavo2-aks-cluster_eastus2/providers/Microsoft.Compute/virtualMachineScaleSets/aks-default-42366886-vmss/virtualMachines/0
Non-terminated Pods: (16 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
app-routing-system nginx-5d4cbcf56b-qwkn4 500m (26%) 0 (0%) 127Mi (2%) 0 (0%) 76m
app-routing-system nginx-5d4cbcf56b-rkrx4 500m (26%) 0 (0%) 127Mi (2%) 0 (0%) 77m
default archetype-deployment-69945d5cf9-klrff 0 (0%) 0 (0%) 0 (0%) 0 (0%) 109s
default archetype-deployment-69945d5cf9-wnh7p 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3m28s
kube-system azure-ip-masq-agent-lbpkh 100m (5%) 500m (26%) 50Mi (1%) 250Mi (5%) 22h
kube-system cloud-node-manager-qbxww 50m (2%) 0 (0%) 50Mi (1%) 512Mi (11%) 22h
kube-system coredns-789789675-nn9qz 100m (5%) 3 (157%) 70Mi (1%) 500Mi (11%) 22h
kube-system coredns-789789675-ss2hf 100m (5%) 3 (157%) 70Mi (1%) 500Mi (11%) 22h
kube-system coredns-autoscaler-649b947bbd-xr74t 20m (1%) 200m (10%) 10Mi (0%) 500Mi (11%) 22h
kube-system csi-azuredisk-node-69clt 30m (1%) 0 (0%) 60Mi (1%) 400Mi (8%) 22h
kube-system csi-azurefile-node-mwcpv 30m (1%) 0 (0%) 60Mi (1%) 600Mi (13%) 22h
kube-system konnectivity-agent-7565b9c9b8-bmjb2 20m (1%) 1 (52%) 20Mi (0%) 1Gi (22%) 22h
kube-system konnectivity-agent-7565b9c9b8-t55pk 20m (1%) 1 (52%) 20Mi (0%) 1Gi (22%) 22h
kube-system kube-proxy-nct72 100m (5%) 0 (0%) 0 (0%) 0 (0%) 22h
kube-system metrics-server-5bd48455f4-6cmbn 50m (2%) 145m (7%) 85Mi (1%) 355Mi (7%) 22h
kube-system metrics-server-5bd48455f4-n8km8 50m (2%) 145m (7%) 85Mi (1%) 355Mi (7%) 22h
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 1670m (87%) 8990m (473%)
memory 834Mi (18%) 6020Mi (132%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events: <none>
The interesting bit is that the node has an allocatable CPU of 1900m (a bit under 2 cores) and the pods already running on it, none of which I created by the way, already request 1670m of CPU.
The deployment asks for 2 archetype pods that each request 250m of CPU (scheduling is based on requests, not limits), so naturally Kubernetes cannot schedule them because 1670m + 2 × 250m = 2170m > 1900m.
I could add more VMs to give the cluster more capacity, but I’m cheap, so instead I’ll remove the resource requests and limits from the pod spec.
Maybe this is dangerous in production, but for my learning purposes I don’t really care about violating resource constraints.
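Concretely, the containers section of the Deployment shrinks to this (the same manifest as above, just without the resources block):

      containers:
        - name: archetype
          image: gustavo2acr.azurecr.io/archetype:latest-amd64
          ports:
            - containerPort: 8000
          # No resources block: the scheduler treats the requests as zero and applies no limits.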
Service
The service definition looks like this:
apiVersion: v1
kind: Service
metadata:
name: archetype-service
spec:
type: ClusterIP
ports:
- port: 80
targetPort: 8000
selector:
app: archetype
Note that the spec.selector.app of the service matches the spec.template.metadata.labels.app of the deployment.
This is how Kubernetes finds archetype pods to expose as the archetype-service.
If the labels don’t match, then the pods simply aren’t exposed!
❯ kubectl describe service archetype-service
Name: archetype-service
Namespace: default
Labels: <none>
Annotations: <none>
Selector: app=archetype
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 192.168.90.238
IPs: 192.168.90.238
Port: <unset> 80/TCP
TargetPort: 8000/TCP
Endpoints: <none>
Session Affinity: None
Events: <none>
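The Endpoints: <none> line stands out: it means no ready pods currently match the selector (likely because my pods weren’t schedulable yet when I ran this). Once they are Running, this is a quick way to confirm the service found them:

# Lists the pod IPs the service routes to; empty until matching pods are Ready.
kubectl get endpoints archetype-service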
ClusterIP services are only reachable inside the cluster.
That means a pod within the cluster can communicate with the service either at the IP address 192.168.90.238 or at the DNS name archetype-service.
The outside world cannot reach this service!
For that we need to define an Ingress object.
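Before moving on, a quick way to sanity-check the in-cluster half of that claim is to curl the service from a throwaway pod (curlimages/curl is just one convenient image for this):

# Run a one-off pod inside the cluster and hit the service by its DNS name.
kubectl run curl-test --rm -it --restart=Never \
  --image=curlimages/curl --command -- curl http://archetype-service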
Ingress
This is where the Azure documentation falls apart.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: archetype-ingress
spec:
ingressClassName: webapprouting.kubernetes.azure.com
rules:
- host: archetype.gustavohidalgo.com
http:
paths:
- backend:
service:
name: archetype-service
port:
number: 80
path: /
pathType: Prefix
Looking at the limitations:
All global Azure DNS zones integrated with the add-on have to be in the same resource group.
Ok that’s kind of a stupid limitation.
I already have a DNS Zone resource that I created long ago for gustavohidalgo.com
in a different resource group, and I don’t know whether I need to (or should) create a new DNS zone.
Maybe we don’t need to create a DNS zone, because actually this worked:
❯ curl 4.152.12.223/ -H "Host: archetype.gustavohidalgo.com"
{"Hello":"NAMELESS!"}
And the address 4.152.12.223
was allocated either when I enabled the app routing feature or when I created the ingress.
Who knows, AKS definitely doesn’t tell you when new billable resources are created under your nose!
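For what it’s worth, the address can at least be recovered after the fact from the ingress itself rather than by hunting through the portal; the ADDRESS column should show the public IP the app-routing nginx controller is exposing:

# The ADDRESS column holds the ingress controller's public IP.
kubectl get ingress archetype-ingress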
Conclusion
After all of this, I have a dockerized Python application running in a Kubernetes cluster. Cool, but I still feel like AKS is overly complicated.