Losing control to Kubernetes
Kubernetes is about giving up control. As someone who likes to understand what’s going on, that’s made it hard for me to embrace it. I’ve also mostly been able to ignore it, which has helped. However, I’m aware it’s incredibly popular, and there’s some infrastructure at work that uses it. While it’s not my responsibility, I always find that having an actual implementation of something is useful in understanding it generally, so I decided it was time to dig in and learn something new.
First up, I should say I understand the trade-off here: handing a bunch of decisions about the underlying platform off to Kubernetes allows development/deployment to concentrate on a nice consistent environment. I get the analogy with the shipping container model, where you can abstract out both sides knowing all you have to do is conform to the TEU API. In terms of the underlying concepts I’ve got some virtualisation and container experience, so I’m not coming at this as a complete newcomer. And I understand multi-site dynamically routed networks.
That said, let’s start with a basic goal. I’d like to understand k8s (see, I can be cool and use the short name) enough to be comfortable with what’s going on under the hood and be able to examine a running instance safely (i.e. enough confidence about pulling logs, probing state etc without fearing I might modify state). That’ll mean when I come across such infrastructure I have enough tools to be able to hopefully learn from it.
To do this I figure I’ll need to build myself a cluster and deploy some things on it, then poke it. I’ll start by doing so on bare metal; that removes variables around cloud providers and virtualisation and gives me an environment I know is isolated from everything else. I happen to have a GMK NucBox available, so I’ll use that.
As a first step I’m aiming to get a single node cluster deployed running some sort of web accessible service that is visible from the rest of my network. That should mean I’ve covered the basics of a Kubernetes install, a running service and actually making it accessible.
Of course I’m running Debian. I’ve got a Bullseye (Debian 11) install - not yet released as stable, but in freeze and therefore not a moving target. I wanted to use packages from Debian as much as possible, but it seems the bits of Kubernetes available in main are mostly just building blocks and not a great starting point for someone new to Kubernetes. So to do the initial install I did the following:
# Install docker + nftables from Debian
apt install docker.io nftables
# Add the Kubernetes repo and signing key
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg > /etc/apt/k8s.gpg
cat > /etc/apt/sources.list.d/kubernetes.list <<EOF
deb [signed-by=/etc/apt/k8s.gpg] https://apt.kubernetes.io/ kubernetes-xenial main
EOF
apt update
apt install kubelet kubeadm kubectl
That resulted in a 1.21.1-00 install, which is current at the time of writing. I then used kubeadm to create the cluster:
kubeadm init --apiserver-advertise-address 192.168.53.147 --apiserver-cert-extra-sans udon.mynetwork
The extra parameters were to make the API server externally accessible from the host. I don’t know if that was a good idea or not at this stage…
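(If you want to confirm where the API server ended up listening without touching any cluster state, looking at the host’s listening sockets should do the job; 6443 is the default kube-apiserver port:)
# Check the API server is listening (run on the host itself, as root)
ss -ltnp | grep 6443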
kubeadm spat out a bunch of instructions, but the key piece was about copying the credentials to my user account. So I did:
mkdir ~noodles/.kube
cp -i /etc/kubernetes/admin.conf ~noodles/.kube/config
chown -R noodles ~noodles/.kube/
I then was able to see my pod:
noodles@udon:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
udon NotReady control-plane,master 4m31s v1.21.1
Ooooh. But why’s it NotReady? Seems like it’s a networking issue and I need to install a networking provider. The documentation on this is appalling. Flannel gets recommended as a simple option, but it turns out to need a --pod-network-cidr option passed to kubeadm at init time.
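(For reference, I believe the Flannel route would have looked something like this, with 10.244.0.0/16 being Flannel’s default pod range; I didn’t actually run it:)
# Hypothetical Flannel variant - NOT what I did
kubeadm init --apiserver-advertise-address 192.168.53.147 \
  --apiserver-cert-extra-sans udon.mynetwork \
  --pod-network-cidr=10.244.0.0/16
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml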
I didn’t feel like cleaning up and running kubeadm init again (I’ve omitted all the false starts it took me to get to this point). Another pointer was to Weave, so I decided to try that with the following magic runes:
mkdir -p /var/lib/weave
head -c 16 /dev/urandom | shasum -a 256 | cut -d " " -f1 > /var/lib/weave/weave-passwd
kubectl create secret -n kube-system generic weave-passwd --from-file=/var/lib/weave/weave-passwd
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')&password-secret=weave-passwd&env.IPALLOC_RANGE=192.168.0.0/24"
(I believe what that’s doing is the first 3 lines create a password and store it into the internal Kubernetes config so the weave pod can retrieve it. The final line then grabs a YAML config from Weaveworks to configure up weave. My intention is to delve deeper into what’s going on here later; for now the primary purpose is to get up and running.)
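(If you want a little reassurance before moving on, a couple of read-only checks should work; the name=weave-net label and the weave container name are, I believe, what the Weave manifest uses:)
# Confirm the password secret made it into kube-system
kubectl get secret -n kube-system weave-passwd
# Check the weave pod is up, and look at its logs if it isn't
kubectl get pods -n kube-system -l name=weave-net
kubectl logs -n kube-system -l name=weave-net -c weave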
As I’m running a single node cluster I then had to untaint my control node so I could use it as a worker node too:
kubectl taint nodes --all node-role.kubernetes.io/master-
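(A harmless read-only query should confirm the taint is actually gone:)
kubectl describe node udon | grep Taints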
And then:
noodles@udon:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
udon Ready control-plane,master 15m v1.21.1
Result.
What’s actually running? Nothing except the actual system stuff, so we need to ask for all namespaces:
noodles@udon:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-558bd4d5db-4nvrg 1/1 Running 0 18m
kube-system coredns-558bd4d5db-flrfq 1/1 Running 0 18m
kube-system etcd-udon 1/1 Running 0 18m
kube-system kube-apiserver-udon 1/1 Running 0 18m
kube-system kube-controller-manager-udon 1/1 Running 0 18m
kube-system kube-proxy-6d8kg 1/1 Running 0 18m
kube-system kube-scheduler-udon 1/1 Running 0 18m
kube-system weave-net-mchmg 2/2 Running 1 3m26s
These are all things I’m going to have to learn about, but for now I’ll nod and smile and pretend I understand.
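(That said, poking at them read-only should be safe enough, and fits my goal of being able to examine state without changing it; for example:)
# Describe one of the control plane pods
kubectl -n kube-system describe pod etcd-udon
# Tail the API server logs
kubectl -n kube-system logs kube-apiserver-udon --tail=20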
Now I want to actually deploy something to the cluster. I ended up with a simple HTTP echoserver (though it’s not entirely clear that’s actually the source for what I ended up pulling):
$ kubectl create deployment hello-node --image=k8s.gcr.io/echoserver:1.10
deployment.apps/hello-node created
$ kubectl get pod
NAME READY STATUS RESTARTS AGE
hello-node-59bffcc9fd-8hkgb 1/1 Running 0 36s
$ kubectl expose deployment hello-node --type=NodePort --port=8080
$ kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
hello-node NodePort 10.107.66.138 <none> 8080:31529/TCP 1m
Looks good. And to test locally:
curl http://10.107.66.138:8080/
Hostname: hello-node-59bffcc9fd-8hkgb
Pod Information:
-no pod information available-
Server values:
server_version=nginx: 1.13.3 - lua: 10008
Request Information:
client_address=192.168.53.147
method=GET
real path=/
query=
request_version=1.1
request_scheme=http
request_uri=http://10.107.66.138:8080/
Request Headers:
accept=*/*
host=10.107.66.138:8080
user-agent=curl/7.74.0
Request Body:
-no body in request-
Neat. But my external network is 192.168.53.0/24 and that’s a 10.* address, so how do I actually make it visible to other hosts?
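(As an aside, because it’s a NodePort service it should, I believe, already answer on the node’s own address at the high port from the service listing above; but a random high port isn’t really what I’m after.)
curl http://192.168.53.147:31529/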
What I seem to need is an Ingress Controller, which provides some sort of proxy between the outside world and pods within the cluster. Let’s pick nginx, because at least I have some vague familiarity with that and it seems like it should be able to do a bunch of HTTP redirection to different pods depending on the incoming request.
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.46.0/deploy/static/provider/cloud/deploy.yaml
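(The controller takes a little while to come up; a couple of read-only checks, assuming the controller pods carry the usual app.kubernetes.io/component=controller label:)
kubectl get pods -n ingress-nginx
kubectl wait -n ingress-nginx --for=condition=ready pod \
  --selector=app.kubernetes.io/component=controller --timeout=120s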
I then wanted to expose the hello-node to the outside world, and for that I finally had to write some YAML:
cat > hello-ingress.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /$1
spec:
  rules:
  - host: udon.mynetwork
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: hello-node
            port:
              number: 8080
EOF
i.e. incoming requests to http://udon.mynetwork/ should go to the hello-node on port 8080. I applied this:
$ kubectl apply -f hello-ingress.yaml
ingress.networking.k8s.io/example-ingress created
$ kubectl get ingress
NAME CLASS HOSTS ADDRESS PORTS AGE
example-ingress <none> udon.mynetwork 80 3m8s
No address? What have I missed? Let’s check the nginx service, which apparently lives in the ingress-nginx namespace:
noodles@udon:~$ kubectl get services -n ingress-nginx
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ingress-nginx-controller LoadBalancer 10.96.9.41 <pending> 80:32740/TCP,443:30894/TCP 13h
ingress-nginx-controller-admission ClusterIP 10.111.16.129 <none> 443/TCP 13h
<pending> does not seem like something I want. Digging around, it seems I need to configure the external IP. So I do:
kubectl patch svc ingress-nginx-controller -n ingress-nginx -p \
'{"spec": {"type": "LoadBalancer", "externalIPs":["192.168.53.147"]}}'
and things look happier:
noodles@udon:~$ kubectl get services -n ingress-nginx
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ingress-nginx-controller LoadBalancer 10.96.9.41 192.168.53.147 80:32740/TCP,443:30894/TCP 14h
ingress-nginx-controller-admission ClusterIP 10.111.16.129 <none> 443/TCP 14h
noodles@udon:~$ kubectl get ingress
NAME CLASS HOSTS ADDRESS PORTS AGE
example-ingress <none> udon.mynetwork 192.168.53.147 80 14h
Let’s try a curl from a remote host:
curl http://udon.mynetwork/
Hostname: hello-node-59bffcc9fd-8hkgb
Pod Information:
-no pod information available-
Server values:
server_version=nginx: 1.13.3 - lua: 10008
Request Information:
client_address=192.168.0.5
method=GET
real path=/
query=
request_version=1.1
request_scheme=http
request_uri=http://udon.mynetwork:8080/
Request Headers:
accept=*/*
host=udon.mynetwork
user-agent=curl/7.64.0
x-forwarded-for=192.168.53.136
x-forwarded-host=udon.mynetwork
x-forwarded-port=80
x-forwarded-proto=http
x-real-ip=192.168.53.136
x-request-id=6aaef8feaaa4c7d07c60b2d05c45f75c
x-scheme=http
Request Body:
-no body in request-
Ok, so that seems like success. I’ve got a single node cluster running a single actual ‘application’ pod (the echoserver) and exposing it to the outside world. That’s enough to start poking under the hood. Which is for another post, as this one is already getting longer than I’d like. I’ll just leave some final thoughts on things I need to work out:
- What’s going on with the networking?
- Where’s the IPv6 (the host in question has native IPv6 routing)?
- How do I deploy my own application pod?
- I should look at a multiple host setup (i.e. a real cluster).
- How much of this goes away if I use a cloud provider like AWS to run the cluster?
- Can I do this with the Debian Kubernetes packages?