
Deploy a Kubernetes cluster with K3s and the Kubernetes Dashboard for monitoring

Introduction

In this document I am going to outline the steps I followed to set up my Kubernetes cluster, along with an overview of the issues I faced during the installation and how I fixed or worked around them. Once I had my cluster in place, I also deployed a lightweight Kubernetes dashboard to monitor the cluster and to verify that it is able to host workloads as expected.

My Kubernetes cluster has one master node and two worker nodes for high availability, along with the flannel CNI for pod-to-pod networking, MetalLB for load balancing, and Traefik with wildcard certificates from Let's Encrypt (which is a document for another day).

K3s Installation

Installing k3s is pretty simple and straightforward. I just had to run a few simple commands across the nodes, which I have outlined below. Starting with the master node:

## Disabling servicelb because I want to use MetalLB for load balancing, and
## disabling traefik because I will be installing it manually with my own configuration.
curl -sfL https://get.k3s.io | sh -s - server \
  --cluster-init \
  --tls-san <MASTER_NODE_IP> \
  --disable servicelb \
  --disable traefik

The above command will install k3s on your master node along with all the components required for the control plane, like etcd and the API server. To test it, I ran the following command:

kubectl get nodes -o wide

Now that I have my master node working, I retrieved the node token (which is required for joining the worker nodes) with the following command:

sudo cat /var/lib/rancher/k3s/server/node-token  ## Token generated by the master node; worker nodes need it to join the cluster.

Now, I ran the following on both of my worker nodes.

curl -sfL https://get.k3s.io | K3S_URL=https://<MASTER_NODE_IP>:6443 K3S_TOKEN=<TOKEN> sh -

One of the main advantages of using k3s for my Kubernetes cluster is the flexibility to add more master or worker nodes in the future with just a single command, to support heavier workloads. To add another master node, I just have to run the following command:

curl -sfL https://get.k3s.io | sh -s - server \
  --server https://<MASTER_NODE_IP>:6443 \
  --token <TOKEN> \
  --tls-san <MASTER_NODE_IP> \
  --tls-san <NEW_MASTER_NODE_IP> \
  --disable servicelb \
  --disable traefik

Now, hopefully, if there were no issues, running the following command

kubectl get nodes -o wide

should produce output that looks something like:

NAME                STATUS   ROLES                       AGE     VERSION        INTERNAL-IP     EXTERNAL-IP   OS-IMAGE       KERNEL-VERSION      CONTAINER-RUNTIME
master-node-01      Ready    control-plane,etcd,master   3d15h   v1.31.4+k3s1   <>              <none>        Ubuntu 24.10   6.11.0-14-generic   containerd://<version>
worker-node-01      Ready    <none>                      3d14h   v1.31.4+k3s1   <>              <none>        Ubuntu 24.10   6.11.0-14-generic   containerd://<version>
worker-node-02      Ready    <none>                      3d14h   v1.31.4+k3s1   <>              <none>        Ubuntu 24.10   6.11.0-14-generic   containerd://<version>

MetalLB configuration

For load balancing, I have used MetalLB with the following configuration:

apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  addresses:
  - <IP-ADDRESS-RANGE>  # Choose an unused IP range in your network

---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: advert
  namespace: metallb-system

The IPAddressPool kind tells MetalLB which IPs to use and assign to workloads. All traffic passes through MetalLB, which is the entry point for all traffic into my Kubernetes cluster. The L2Advertisement kind enables Layer 2 mode in MetalLB, advertising the assigned LoadBalancer IP range via ARP for IPv4 and NDP for IPv6.

When a LoadBalancer service is created, MetalLB assigns an IP address from the specified IPAddressPool. The L2Advertisement then announces this assigned IP address so that other devices and services on the network can reach it. If MetalLB is not advertising LoadBalancer IPs, adding an L2Advertisement ensures that they are properly announced over ARP/NDP; if your LoadBalancer IP is not reachable, this configuration could be missing.
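To sanity-check the whole path, a quick test is to expose a deployment as a LoadBalancer service and confirm it gets an IP from the pool. This is a minimal sketch; the whoami deployment name and image are just an example, not part of my setup:

kubectl create deployment whoami --image=traefik/whoami
kubectl expose deployment whoami --type=LoadBalancer --port=80

## EXTERNAL-IP should show an address from the IPAddressPool range
kubectl get svc whoami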

MetalLB vs Load Balancer (Traefik)

MetalLB is NOT a load balancer itself (in the traditional sense). It is a Service LoadBalancer implementation for Kubernetes in bare-metal environments. Kubernetes LoadBalancer services require an external load balancer to assign public IPs, but bare-metal clusters don't have cloud provider integration. MetalLB fills this gap by providing IPs to LoadBalancer services inside your cluster, ensuring external traffic can reach LoadBalancer-type services. I have configured MetalLB to allocate external IPs using an IPAddressPool (e.g., 192.168.1.1). The L2Advertisement ensures that MetalLB announces these IPs using Layer 2 networking (ARP/NDP).

Traefik is an Ingress Controller & Reverse Proxy and it acts as an application-layer load balancer for HTTP/S traffic.

Purpose of Traefik:

• Traefik routes incoming HTTP/S traffic to services within your cluster based on domain names and paths.
• It uses Kubernetes Ingress resources to map incoming requests to backend services.
• Unlike MetalLB, Traefik can handle TLS termination, rate limiting, and path-based routing.

In my environment, Traefik is running as a Kubernetes LoadBalancer service, meaning it needs an external IP. MetalLB assigns this external IP (192.168.3.225) so it can be reached from outside the cluster. I have defined Ingress resources to route traffic through Traefik.
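As a rough illustration (not my exact manifest; the host, service name, and port below are placeholders), an Ingress that routes a domain through Traefik to a backend service could look like this:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
  annotations:
    kubernetes.io/ingress.class: <TRAEFIK-SERVICE-NAME>
spec:
  rules:
  - host: <CUSTOM-DOMAIN-NAME>
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: <BACKEND-SERVICE-NAME>
            port:
              number: 80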

Without MetalLB (cloud example): in a cloud environment (e.g., AWS, GCP, Azure), a LoadBalancer service would get a cloud-managed external IP, and Kubernetes services marked as LoadBalancer automatically receive external access.

With MetalLB (bare-metal Kubernetes): MetalLB provides the missing piece by assigning external IPs in your local network. Traefik acts as the application-layer proxy, while MetalLB exposes Traefik to the outside world.

NAME      TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)                                    AGE
traefik   LoadBalancer   <CLUSTER_IP>   <EXTERNAL-IP>   80:30698/TCP,443:31562/TCP,443:31562/UDP   3d5h

Issues I have faced and troubleshooting

Taints and Tolerations

The first issue I faced when installing k3s on my master node was a taint issue. A taint prevents pods from being scheduled on a node. Taints are usually used to manage node-level constraints and to ensure that workloads run on the right resources. For example, consider a three-node cluster that consists of a master node and two worker nodes (node A and node B). Worker node A has an attached GPU, making it expensive but well suited for GPU-accelerated workloads, while worker node B is a general-purpose node without an attached GPU.

By adding a taint to worker node A and giving the necessary toleration only to GPU-specific workloads, administrators can ensure that Kubernetes will only schedule GPU-accelerated workloads on worker node A; all other pods will be scheduled on the general-purpose worker node B. This technique ensures that valuable GPU capacity isn't wasted on the wrong types of workloads.
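In that GPU example, tainting the node is a one-liner; the node name and the gpu key below are hypothetical:

## Only pods with a matching toleration will be scheduled on worker node A
kubectl taint nodes worker-node-a gpu=true:NoSchedule

The GPU workloads would then carry a toleration for gpu=true:NoSchedule, analogous to the control-plane toleration shown further below.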

For more information regarding Kubernetes taints and tolerations, refer to the following: Blog Documentation

I had two options with regard to the taint: (1) remove the taint entirely, or (2) add a toleration to the workloads that needed to run on the control plane. I chose to add a toleration instead of removing the taint, to keep the control-plane hardware from being overloaded with workloads. I added the toleration using the following:

spec:
  tolerations:
  - key: "node-role.kubernetes.io/control-plane"
    operator: "Exists"
    effect: "NoSchedule"

The above configuration allows pods that explicitly tolerate this taint to be scheduled on control-plane nodes. Verify the scheduling of pods using the following kubectl command:

kubectl get pods -o wide
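For completeness, option 1 (removing the taint entirely) would have been a single command; the trailing minus removes the taint, and the node name is a placeholder:

kubectl taint nodes <MASTER_NODE_NAME> node-role.kubernetes.io/control-plane:NoSchedule-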

Networking Issues

This tripped me up for a second, as pod-to-pod networking was not working in my cluster out of the box and I had to set up flannel manually. I also had to make sure that the master node and the API server were reachable. Furthermore, I had to make sure that UFW, if enabled, had the proper rules to allow TCP traffic on port 6443.
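The UFW rules I would expect to need are sketched below, based on the ports the k3s documentation lists for the API server, flannel VXLAN, and the kubelet; adjust them to your topology:

sudo ufw allow 6443/tcp    ## Kubernetes API server
sudo ufw allow 8472/udp    ## flannel VXLAN (pod-to-pod traffic)
sudo ufw allow 10250/tcp   ## kubelet metrics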

For general network troubleshooting, I would start by describing the pod to see if there were any issues while creating it, then get the pod's logs and look for specific errors such as an unreachable API server. Between the events from describe and the logs, you can usually determine whether it is a networking issue. The services are also worth inspecting, and issues can be resolved based on what the logs show.

kubectl describe pod <POD_NAME> -n <NAMESPACE_NAME>

kubectl logs <POD_NAME> -n <NAMESPACE_NAME>

kubectl get svc

The kube-system namespace has all the workloads belonging to k3s itself, including its DNS. Restarting the affected pods is a reasonable first step; if the issue still persists, proceed with the aforementioned troubleshooting steps.
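As a sketch, restarting the cluster DNS looks like the following; the coredns deployment name assumes the default k3s setup:

## Recreates the CoreDNS pods via a rolling restart
kubectl -n kube-system rollout restart deployment coredns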

When I first set up my master node, I wasn't able to access the token located at /var/lib/rancher/k3s/server/node-token because of a permissions issue. I had to set 600 permissions on the node-token file and chown it. The same applied to the kubeconfig file located at /etc/rancher/k3s/k3s.yaml
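A sketch of that fix, with the user name as a placeholder:

sudo chmod 600 /var/lib/rancher/k3s/server/node-token
sudo chown <YOUR_USER> /var/lib/rancher/k3s/server/node-token

## Same fix for the kubeconfig
sudo chmod 600 /etc/rancher/k3s/k3s.yaml
sudo chown <YOUR_USER> /etc/rancher/k3s/k3s.yaml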

For remote access to my Kubernetes cluster, I had to copy the kubeconfig file from the above-mentioned location to my local environment. Initially, I got an "API server not reachable" error because that file lists the API server as https://127.0.0.1:6443. I had to change this to my master node's IP address, since the requests have to reach the master node where the API server runs.
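A sketch of those two steps (assuming the file on the master node is readable over SSH by <USER>):

## Copy the kubeconfig from the master node to the local machine
scp <USER>@<MASTER_NODE_IP>:/etc/rancher/k3s/k3s.yaml ~/.kube/config

## Point the kubeconfig at the master node instead of localhost
sed -i 's/127.0.0.1/<MASTER_NODE_IP>/' ~/.kube/config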

Install Kubernetes Dashboard for cluster monitoring

Dashboard is a web-based Kubernetes user interface. You can use Dashboard to deploy containerized applications to a Kubernetes cluster, troubleshoot your containerized application, and manage the cluster resources. You can use Dashboard to get an overview of applications running on your cluster, as well as for creating or modifying individual Kubernetes resources (such as Deployments, Jobs, DaemonSets, etc). For example, you can scale a Deployment, initiate a rolling update, restart a pod or deploy new applications using a deploy wizard.

Dashboard also provides information on the state of Kubernetes resources in your cluster and on any errors that may have occurred. For more information, refer to the following document: Kubernetes Dashboard

The Kubernetes Dashboard can be installed using Helm, as outlined in its documentation.

# Add kubernetes-dashboard repository
helm repo add kubernetes-dashboard https://kubernetes.github.io/dashboard/
# Deploy a Helm Release named "kubernetes-dashboard" using the kubernetes-dashboard chart
helm upgrade --install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard --create-namespace --namespace kubernetes-dashboard

As it is a Helm deployment, we can customize the values file. All the available customizations are listed in the Kubernetes Dashboard documentation. The following are the values that I have customized.

app:
  mode: 'dashboard'
  image:
    pullPolicy: IfNotPresent
    pullSecrets: []
  ingress:
    enabled: true
    hosts:
      - <CUSTOM-DOMAIN-NAME>
    ingressClassName: <TRAEFIK-SERVICE-NAME>
    useDefaultIngressClass: false
    useDefaultAnnotations: true
    pathType: ImplementationSpecific
    path: /
    issuer:
      name: selfsigned
      scope: default
    tls:
      enabled: true
      secretName: <SELFSIGNED-CERT-SECRET-NAME>
    labels: {}
    annotations:
      kubernetes.io/ingress.class: <TRAEFIK-SERVICE-NAME>
      traefik.ingress.kubernetes.io/router.entrypoints: websecure
      traefik.ingress.kubernetes.io/router.tls: "true"
      traefik.ingress.kubernetes.io/router.middlewares: default-headers@kubernetescrd
  tolerations: []
  affinity: {}

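Assuming the customizations above are saved to a file (values.yaml is a hypothetical name), they can be applied by passing the file to the same Helm command:

helm upgrade --install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard \
  --create-namespace --namespace kubernetes-dashboard \
  -f values.yaml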
The Helm chart will create a new namespace for hosting the Kubernetes Dashboard and will create five pods in it:

NAME                                         READY    STATUS   RESTARTS     AGE
kubernetes-dashboard-api-<ID>                 1/1     Running   0          2d16h
kubernetes-dashboard-auth-<ID>                1/1     Running   0          2d16h
kubernetes-dashboard-kong-<ID>                1/1     Running   0          12s
kubernetes-dashboard-metrics-scraper-<ID>     1/1     Running   0          2d16h
kubernetes-dashboard-web-<ID>                 1/1     Running   0          2d16h

Once the dashboard is up and running, we need to create a user and a token. In older versions of the dashboard a kubeconfig could also be used for access, but that feature has been removed in newer versions. Refer to the following markdown file for creating an admin user and accessing the dashboard: Access to Dashboard. The following is the configuration I used to create an admin user and issue a token.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: <SERVICE-ACCOUNT-NAME/USER-NAME>
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: <SERVICE-ACCOUNT-NAME/USER-NAME>
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: <ROLE-NAME>
subjects:
- kind: ServiceAccount
  name: <SERVICE-ACCOUNT-NAME/USER-NAME>
  namespace: kubernetes-dashboard

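Saving that manifest to a file (dashboard-admin.yaml is a hypothetical name) and applying it:

kubectl apply -f dashboard-admin.yaml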
Depending on the Kubernetes version, a token secret may have been created automatically for the service account; it can be read using the first set of commands below. On newer Kubernetes versions (1.24+), service account token secrets are no longer created automatically, so the token has to be created explicitly.

## Get the service accounts of the namespace
kubectl get serviceaccount -n kubernetes-dashboard
## Get the secrets in the same namespace
kubectl get secret -n kubernetes-dashboard
## Get the token from the information retrieved from the above commands
kubectl get secret <SECRET_NAME> -n kubernetes-dashboard -o jsonpath="{.data.token}" | base64 --decode

## Create a token to access the dashboard.
kubectl create token <SERVICE-ACCOUNT-NAME/USER-NAME> -n kubernetes-dashboard

The token would look something like:

eyJhbGciOiJSUzI1NiIsInR5cCIgOiAiSldU...

IngressRoute

To access the Kubernetes Dashboard using the custom domain we specified, I had to create an IngressRoute, which looks something like this:

apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
  annotations:
    kubernetes.io/ingress.class: <TRAEFIK-SERVICE-NAME>
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`<DASHBOARD-CUSTOM-DOMAIN>`)
      kind: Rule
      services:
        - name: <KUBERNETES-DASHBOARD-KONG-SERVICE-NAME>  # Must be the kong proxy Service created by the chart, not a pod name
          port: 443
      middlewares:
        - name: default-headers
  tls:
    secretName: <TRAEFIK-CERTIFICATE-NAME>
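
With the IngressRoute saved to a file (dashboard-ingressroute.yaml is a hypothetical name), applying it and browsing to https://<DASHBOARD-CUSTOM-DOMAIN> should bring up the login page, where the token created earlier is used to sign in:

kubectl apply -f dashboard-ingressroute.yaml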