# RKE2

Docs about RKE2 can be found [here](https://docs.rke2.io/).

## Install RKE2

There are two install types: `agent` for pure worker nodes and `server` for management nodes. There need to be at least three `server` nodes; there can be any number of `agent` nodes.

It is probably best to install the same version of Kubernetes as the cluster the node will be attached to. If needed, the whole cluster can be upgraded after adding the nodes.

```bash
curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE="agent" INSTALL_RKE2_VERSION=v1.xx.yy+rke2zz sudo -E sh -
sudo mkdir -p /etc/rancher/rke2
sudo vi /etc/rancher/rke2/config.yaml
```

Create a `/etc/rancher/rke2/config.yaml` which will look something like:

```yaml
node-name: leap-micro6
node-external-ip:
  - 10.3.6.55
node-ip:
  - 10.6.16.55
advertise-address: 10.6.16.55
# the following token can be found on 10.6.16.61 using
# sudo cat /var/lib/rancher/rke2/server/node-token
# comment this setting for the very first master node
token: K10<...>::server:<...>
# on all other servers after the initial setup on 10.6.16.61 is completed
server: https://10.6.16.61:9345
cluster-cidr: 10.42.0.0/16
service-cidr: 10.43.0.0/16
cni: calico
disable-kube-proxy: false
etcd-expose-metrics: false
etcd-snapshot-retention: 5
etcd-snapshot-schedule-cron: 0 */5 * * *
kube-controller-manager-arg:
  - cert-dir=/var/lib/rancher/rke2/server/tls/kube-controller-manager
  - secure-port=10257
kube-controller-manager-extra-mount:
  - >-
    /var/lib/rancher/rke2/server/tls/kube-controller-manager:/var/lib/rancher/rke2/server/tls/kube-controller-manager
kube-scheduler-arg:
  - cert-dir=/var/lib/rancher/rke2/server/tls/kube-scheduler
  - secure-port=10259
kube-scheduler-extra-mount:
  - >-
    /var/lib/rancher/rke2/server/tls/kube-scheduler:/var/lib/rancher/rke2/server/tls/kube-scheduler
kubelet-arg:
  - max-pods=250
kubelet-extra-mount:
  - >-
    /lib/modules:/lib/modules
node-label:
  - cattle.io/os=linux
protect-kernel-defaults: false
```

Create a `/etc/rancher/rke2/registries.yaml` to use a local docker.io mirror, which will look something like:

```yaml
mirrors:
  docker.io:
    endpoint:
      - "http://10.6.16.58:5000"
```

### Server

```bash
export PATH=$PATH:/opt/rke2/bin
sudo systemctl enable rke2-server.service
sudo systemctl start rke2-server.service
sudo journalctl -u rke2-server -f
```

### Agent

```bash
export PATH=$PATH:/opt/rke2/bin
sudo systemctl enable rke2-agent.service
sudo systemctl start rke2-agent.service
sudo journalctl -u rke2-agent -f
```
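Once the service is running, it is worth checking that the node actually joined the cluster. A minimal verification sketch for a `server` node (RKE2 writes its admin kubeconfig to `/etc/rancher/rke2/rke2.yaml` and ships `kubectl` in `/var/lib/rancher/rke2/bin`; agent nodes do not get a kubeconfig, so check them from a server):

```bash
# run on a server node
export PATH=$PATH:/var/lib/rancher/rke2/bin
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
# the new node should show up and eventually report Ready
kubectl get nodes -o wide
```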
## (Re)creating local storage provisioner volumes

Since any service using local storage should implement restoring missing data itself, this section only describes how to create the empty volumes/disks for that. Because the local storage provisioner cannot change the size of volumes, it selects the next larger available volume for any claim: a 20 GB claim will, for example, select a 29.4 GiB volume, a 30 GB claim a 30.2 GiB volume, and so on.

For Flatcar Linux we can follow the advice on the [sig-storage-local-static-provisioner](https://github.com/kubernetes-sigs/sig-storage-local-static-provisioner) website: mount formatted block storage below `/mnt/local-disks/`. Mounting by UUID ensures that mixed-up block devices fail to mount instead of exposing data to the wrong host.

* Create a block device for an acdh-clusterX node in vCenter. Note that the block device should be a little larger than the desired even number of GiB (example: for a 20 GiB volume create a 21 GiB disk), as there is a difference in how disk sizes are calculated.
* Format the volume on the respective Flatcar node. Use ext4 or xfs depending on the needs of the service (for example Elasticsearch/OpenSearch recommends ext4).

  ```bash
  sudo mkfs.ext4 /dev/sdd
  ```

* Reserved blocks for root are not very useful in Kubernetes, so set them to 0:

  ```bash
  sudo tune2fs -r 0 /dev/disk/by-uuid/
  ```

* Get the UUID. It is part of the output of `mkfs.ext4` above. It is also available, for example, using `ls -l /dev/disk/by-uuid/*`.
* Create a mount unit to mount the filesystem. The unit's filename needs to match the mount point and is therefore encoded with `systemd-escape`. Enabling it will automatically create the corresponding directory in `/mnt/local-disks/`.

  ```bash
  sudo cp /etc/systemd/system/var-lib-rancher.mount "/etc/systemd/system/$(systemd-escape --path /mnt/local-disks/).mount"
  sudo vi "/etc/systemd/system/$(systemd-escape --path /mnt/local-disks/).mount"
  # change the directory name and device name:
  #
  # [Unit]
  # Description=Mount local storage at /mnt/local-disks/
  # Before=local-fs.target
  #
  # [Mount]
  # What=/dev/disk/by-uuid/
  # Where=/mnt/local-disks/
  # Type=ext4 (or xfs, matching the filesystem created above)
  #
  # [Install]
  # WantedBy=local-fs.target
  sudo systemctl daemon-reload
  sudo systemctl enable "$(systemd-escape --path /mnt/local-disks/).mount"
  ```

## Updating RKE2

This is best done using the Rancher UI for cluster updates. If the version there and the version on the nodes get out of sync, _all other settings cannot be changed anymore either!_ But for reference, here is the very simple method of following the stable release channel for RKE2:

```bash
curl -sfL https://get.rke2.io | INSTALL_RKE2_CHANNEL=stable sudo -E sh -
sudo systemctl restart rke2-agent
# or
sudo systemctl restart rke2-server
```

Repeat on each node, waiting until the previously updated one shows as up and active in Rancher. Start with the management/`server` nodes, then update the `agent` nodes.

[Here](https://update.rke2.io/v1-release/channels) you can see which version corresponds to `stable` at the moment. Kubernetes minor versions are also channels. The channel `latest` refers to the very latest RKE2 releases available.

## Troubleshooting

### Using command line tools to manually delete container images

```bash
sudo -s  # as root
export PATH=$PATH:/var/lib/rancher/rke2/bin
export CONTAINERD_ADDRESS=/run/k3s/containerd/containerd.sock
ctr -n k8s.io i rm $(ctr -n k8s.io i ls -q | grep )
# or
export CONTAINER_RUNTIME_ENDPOINT=unix:///run/k3s/containerd/containerd.sock
crictl images
crictl rmi 
```
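When the goal is just to reclaim disk space rather than to remove one specific image, `crictl` can also prune everything that no running container references. A sketch assuming the same root shell and socket as above (`--prune` requires a reasonably recent `crictl`):

```bash
export PATH=$PATH:/var/lib/rancher/rke2/bin
export CONTAINER_RUNTIME_ENDPOINT=unix:///run/k3s/containerd/containerd.sock
# show current usage of the image filesystem
crictl imagefsinfo
# remove all images not used by any running container
crictl rmi --prune
```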
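To check that the `registries.yaml` mirror from the install section is actually being used, one approach is to pull a test image through the container runtime and watch the request arrive at the mirror. A sketch under the same environment as above (`alpine` is just an arbitrary test image):

```bash
# pull through the CRI, i.e. through containerd and its mirror configuration
crictl pull docker.io/library/alpine:latest
# RKE2's containerd log can help when the pull fails
sudo tail -f /var/lib/rancher/rke2/agent/containerd/containerd.log
```

If the mirror at `10.6.16.58:5000` is reachable and configured correctly, its own registry log should show the request.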