Showing posts with label kubeadm. Show all posts

Saturday, April 8, 2023

vSphere with Tanzu using NSX-T - Part24 - Kubernetes component certs in TKC

The Kubernetes component certificates inside a TKC (Tanzu Kubernetes Cluster) have a lifetime of 1 year. If you upgrade your TKC at least once a year, these certs get rotated automatically.

 

IMPORTANT NOTES: 

  • As per this VMware KB, if TKGS Guest Cluster certificates are expired, you will need to engage VMware support to manually rotate them.  
  • The following troubleshooting steps and workaround are based on tests conducted on my dev/test/lab setup, and I do NOT recommend following them in a production environment.

 

Symptom:

KUBECONFIG=tkc.kubeconfig kubectl get nodes
Unable to connect to the server: x509: certificate has expired or is not yet valid

 

Troubleshooting:

  • Verify the certificate expiry of the tkc kubeconfig file itself.
❯ grep client-certificate-data tkc.kubeconfig | awk '{print $2}' | base64 -d | openssl x509 -noout -dates
notBefore=Mar  8 18:10:15 2022 GMT
notAfter=Mar  7 18:26:10 2024 GMT
  • Create a jumpbox pod and ssh to TKC control plane nodes.
  • Verify system pods and check logs from apiserver and etcd pods. Sample etcd pod logs are given below:
2023-04-11 07:09:00.268792 W | rafthttp: health check for peer b5bab7da6e326a7c could not connect: x509: certificate has expired or is not yet valid: current time 2023-04-11T07:08:57Z is after 2023-04-06T06:17:56Z
2023-04-11 07:09:00.268835 W | rafthttp: health check for peer b5bab7da6e326a7c could not connect: x509: certificate has expired or is not yet valid: current time 2023-04-11T07:08:57Z is after 2023-04-06T06:17:56Z
2023-04-11 07:09:00.268841 W | rafthttp: health check for peer 19b6b0bf00e81f0b could not connect: remote error: tls: bad certificate
2023-04-11 07:09:00.268869 W | rafthttp: health check for peer 19b6b0bf00e81f0b could not connect: remote error: tls: bad certificate
2023-04-11 07:09:00.310030 I | embed: rejected connection from "172.31.20.27:35362" (error "remote error: tls: bad certificate", ServerName "")
2023-04-11 07:09:00.312806 I | embed: rejected connection from "172.31.20.27:35366" (error "remote error: tls: bad certificate", ServerName "")
2023-04-11 07:09:00.321449 I | embed: rejected connection from "172.31.20.19:35034" (error "remote error: tls: bad certificate", ServerName "")
2023-04-11 07:09:00.322192 I | embed: rejected connection from "172.31.20.19:35036" (error "remote error: tls: bad certificate", ServerName "")
  • Verify whether admin.conf inside the control plane node has expired.
root [ /etc/kubernetes ]# grep client-certificate-data admin.conf | awk '{print $2}' | base64 -d | openssl x509 -noout -dates
notBefore=Mar  8 18:10:15 2022 GMT
notAfter=Apr  6 06:05:46 2023 GMT
  • Verify Kubernetes component certs in all the control plane nodes.
root [ /etc/kubernetes ]# kubeadm certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[check-expiration] Error reading configuration from the Cluster. Falling back to default configuration

CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Apr 06, 2023 06:05 UTC   <invalid>                               no
apiserver                  Apr 06, 2023 06:05 UTC   <invalid>       ca                      no
apiserver-etcd-client      Apr 06, 2023 06:05 UTC   <invalid>       etcd-ca                 no
apiserver-kubelet-client   Apr 06, 2023 06:05 UTC   <invalid>       ca                      no
controller-manager.conf    Apr 06, 2023 06:05 UTC   <invalid>                               no
etcd-healthcheck-client    Apr 06, 2023 06:05 UTC   <invalid>       etcd-ca                 no
etcd-peer                  Apr 06, 2023 06:05 UTC   <invalid>       etcd-ca                 no
etcd-server                Apr 06, 2023 06:05 UTC   <invalid>       etcd-ca                 no
front-proxy-client         Apr 06, 2023 06:05 UTC   <invalid>       front-proxy-ca          no
scheduler.conf             Apr 06, 2023 06:05 UTC   <invalid>                               no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      Mar 05, 2032 18:15 UTC   8y              no
etcd-ca                 Mar 05, 2032 18:15 UTC   8y              no
front-proxy-ca          Mar 05, 2032 18:15 UTC   8y              no
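
The kubeconfig expiry check above is run against both tkc.kubeconfig and admin.conf, so it can be wrapped in a small helper. This is only a sketch from my lab: the function names are made up for illustration, and it assumes the file has a single client-certificate-data entry and that openssl is installed on the node.

```shell
# Print the PEM client certificate embedded in a kubeconfig file.
# Assumption: the file contains exactly one "client-certificate-data:" entry.
kubeconfig_client_cert() {
    grep 'client-certificate-data' "$1" | awk '{print $2}' | base64 -d
}

# Return 0 if the embedded client certificate is still valid, non-zero if it
# has expired. "openssl x509 -checkend 0" exits non-zero for an expired cert.
kubeconfig_cert_valid() {
    kubeconfig_client_cert "$1" | openssl x509 -noout -checkend 0 >/dev/null
}
```

For example, kubeconfig_cert_valid /etc/kubernetes/admin.conf || echo "admin.conf cert has expired" gives a quick yes/no answer without reading the dates by eye.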

 

Workaround:

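  • If the Kubernetes component certs on the control plane nodes have expired, renew them using kubeadm certs renew all. As the last line of the output says, the control plane components must be restarted afterwards to pick up the new certs.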
root [ /etc/kubernetes ]# kubeadm certs renew all
[renew] Reading configuration from the cluster...
[renew] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[renew] Error reading configuration from the Cluster. Falling back to default configuration

certificate embedded in the kubeconfig file for the admin to use and for kubeadm itself renewed
certificate for serving the Kubernetes API renewed
certificate the apiserver uses to access etcd renewed
certificate for the API server to connect to kubelet renewed
certificate embedded in the kubeconfig file for the controller manager to use renewed
certificate for liveness probes to healthcheck etcd renewed
certificate for etcd nodes to communicate with each other renewed
certificate for serving etcd renewed
certificate for the front proxy client renewed
certificate embedded in the kubeconfig file for the scheduler manager to use renewed

Done renewing certificates. You must restart the kube-apiserver, kube-controller-manager, kube-scheduler and etcd, so that they can use the new certificates.
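
The renew output asks for a restart of the control plane components. On a kubeadm-built node these run as static pods, and the usual way to restart them is to move their manifests out of the manifest directory, wait for the kubelet to notice, and move them back. The sketch below assumes the kubeadm default directory /etc/kubernetes/manifests and a 20 second wait; I have not confirmed these for every TKC release, and restart_static_pods is a name I made up.

```shell
# Restart static pods by moving their manifests away and back again.
# $1: manifest directory (assumed kubeadm default: /etc/kubernetes/manifests)
# $2: seconds to wait for the kubelet to react to each change (default 20)
restart_static_pods() {
    manifest_dir=${1:-/etc/kubernetes/manifests}
    wait_seconds=${2:-20}
    backup_dir=$(mktemp -d) || return 1

    # With the manifests gone, the kubelet stops the static pods.
    mv "$manifest_dir"/*.yaml "$backup_dir"/ || return 1
    sleep "$wait_seconds"

    # With the manifests back, the kubelet recreates the pods,
    # which then load the renewed certificates.
    mv "$backup_dir"/*.yaml "$manifest_dir"/ || return 1
    sleep "$wait_seconds"
    rmdir "$backup_dir"
}
```

This stops every control plane component on the node at once, so on a multi-node control plane it is safer to run it on one node at a time.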

 

Verify:

  • Verify using the following steps on all the TKC control plane nodes.
root [ /etc/kubernetes ]# grep client-certificate-data admin.conf | awk '{print $2}' | base64 -d | openssl x509 -noout -dates

root [ /etc/kubernetes ]# kubeadm certs check-expiration

  • Try connecting to the TKC using tkc.kubeconfig.
KUBECONFIG=tkc.kubeconfig kubectl get node
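
To avoid reading the check-expiration table by eye on every node, the output can be filtered for entries that still show <invalid> in the RESIDUAL TIME column. A minimal sketch, assuming the table layout shown earlier in this post; expired_certs is just an illustrative name.

```shell
# Read "kubeadm certs check-expiration" output on stdin and print the name
# (first column) of every entry whose row contains <invalid>.
expired_certs() {
    awk '/<invalid>/ {print $1}'
}
```

Usage: kubeadm certs check-expiration | expired_certs. Empty output means no cert in the table is reported as expired.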

Hope it was useful. Cheers!


Saturday, February 15, 2020

Kubernetes 101 - Part1 - Create K8s cluster with kubeadm


Deploy 3 CentOS 7 VMs. One will be the master and the other two will be workers/slaves.
Master is "m1" and workers are "w1" and "w2".

Plan IP addresses
192.168.105.100 - m1
192.168.105.101 - w1
192.168.105.102 - w2
Step1: on all 3 nodes
vi /etc/hosts
192.168.105.100 m1
192.168.105.101 w1
192.168.105.102 w2
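
If you prefer not to edit /etc/hosts by hand on each node, the same entries can be appended from a script. A small sketch; add_host_entry is a hypothetical helper, and it only checks whether the host name is already present, not whether the IP matches.

```shell
# Append "IP NAME" to a hosts file unless the name is already listed there.
# $1: IP address, $2: host name, $3: hosts file (default /etc/hosts)
add_host_entry() {
    ip=$1
    name=$2
    hosts_file=${3:-/etc/hosts}
    # -w matches the name as a whole word, so "w1" does not match "w10".
    if ! grep -qw -- "$name" "$hosts_file"; then
        printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
    fi
}
```

For example, add_host_entry 192.168.105.100 m1 can be run repeatedly without creating duplicate lines.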
Step2: on all 3 nodes
#Disable firewall and SELinux
sudo firewall-cmd --state
sudo systemctl stop firewalld
sudo systemctl disable firewalld
sudo systemctl status firewalld
setenforce 0
sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux
Step3: on all 3 nodes
#Configure iptables for Kubernetes
cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system
Step4: on all 3 nodes
#Disable swap
swapoff -a
vi /etc/fstab    # comment out (#) the swap entry so that swap stays off after a reboot

Reboot all 3 nodes.
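
The fstab edit can also be scripted instead of using vi. This is a sketch under the assumption that the swap entry is a normal fstab line with a whitespace-delimited "swap" field; comment_out_swap is an illustrative name, and a .bak copy of the file is kept.

```shell
# Comment out every active swap entry in an fstab-style file.
# $1: path to the file (default /etc/fstab); the original is saved as <file>.bak
comment_out_swap() {
    fstab=${1:-/etc/fstab}
    # Prefix "#" to uncommented lines that have a standalone "swap" field.
    sed -i.bak -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab"
}
```

Lines that are already commented out are left alone, so running it twice does no harm.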
Step5: on all 3 nodes
#Configure Kubernetes repo
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
enabled=1
gpgcheck=1
repo_gpgcheck=1
EOF
Step6: Install docker on all 3 nodes
yum install docker -y
Step7: on all 3 nodes
yum install -y kubeadm kubectl kubelet --disableexcludes=kubernetes
systemctl daemon-reload
systemctl restart docker && systemctl enable --now docker
systemctl restart kubelet && systemctl enable --now kubelet
Step8: on master
kubeadm init --pod-network-cidr 10.244.0.0/16
Step9: on master
$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
Step10: on master
Step11: on all worker nodes
Use the join token from master to join worker nodes to the K8s cluster.
Example:
kubeadm join 192.168.105.100:6443 --token 4j1cjv.gcj2hx9suq5akc7v \
--discovery-token-ca-cert-hash sha256:b2ead930d772d8af2e45ca8c86c3895b092484c10f8034e52f917e71dc4c3fea
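
The token printed by kubeadm init expires after 24 hours by default. If it has expired or was lost, running kubeadm token create --print-join-command on the master prints a fresh join command. When copying a token around by hand, a quick format check can catch paste errors; the helper below is only a sketch and is_bootstrap_token is a name I made up.

```shell
# Return 0 if $1 looks like a kubeadm bootstrap token: 6 characters, a dot,
# then 16 characters, all lowercase letters or digits.
is_bootstrap_token() {
    printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}
```

For example, is_bootstrap_token 4j1cjv.gcj2hx9suq5akc7v succeeds. This only checks the shape of the token, not whether the cluster still accepts it.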