Saturday, May 20, 2023

vSphere with Tanzu using NSX-T - Part25 - Spherelet

The Spherelet is based on the Kubernetes “Kubelet” and enables an ESXi hypervisor to act as a Kubernetes worker node. Sometimes you may notice that the worker nodes of your supervisor cluster are having NotReady,SchedulingDisabled status, and it maybe becuase spherelet is not running on those ESXi nodes.

Following are the steps to verify the status of spherelet service, and restart them if required.

Example:
❯ kubectx wdc-01-vcxx
Switched to context "wdc-01-vcxx".
❯ kubectl get node
NAME                               STATUS                        ROLES                  AGE    VERSION
42019f7e751b2818bb0c659028d49fdc   Ready                         control-plane,master   317d   v1.22.6+vmware.wcp.2
4201b0b21aed78d8e72bfb622bb8b98b   Ready                         control-plane,master   317d   v1.22.6+vmware.wcp.2
4201c53dcef2701a8c36463942d762dc   Ready                         control-plane,master   317d   v1.22.6+vmware.wcp.2
wdc-01-rxxesx04.xxxxxxxxx.com      Ready                         agent                  317d   v1.22.6-sph-db56d46
wdc-01-rxxesx05.xxxxxxxxx.com      NotReady,SchedulingDisabled   agent                  317d   v1.22.6-sph-db56d46
wdc-01-rxxesx06.xxxxxxxxx.com      Ready                         agent                  317d   v1.22.6-sph-db56d46
wdc-01-rxxesx32.xxxxxxxxx.com      Ready                         agent                  317d   v1.22.6-sph-db56d46
wdc-01-rxxesx33.xxxxxxxxx.com      Ready                         agent                  317d   v1.22.6-sph-db56d46
wdc-01-rxxesx34.xxxxxxxxx.com      Ready                         agent                  317d   v1.22.6-sph-db56d46
wdc-01-rxxesx35.xxxxxxxxx.com      Ready,SchedulingDisabled      agent                  317d   v1.22.6-sph-db56d46
wdc-01-rxxesx36.xxxxxxxxx.com      Ready                         agent                  317d   v1.22.6-sph-db56d46
wdc-01-rxxesx37.xxxxxxxxx.com      Ready                         agent                  317d   v1.22.6-sph-db56d46
wdc-01-rxxesx38.xxxxxxxxx.com      Ready                         agent                  317d   v1.22.6-sph-db56d46
wdc-01-rxxesx39.xxxxxxxxx.com      NotReady,SchedulingDisabled   agent                  317d   v1.22.6-sph-db56d46
wdc-01-rxxesx40.xxxxxxxxx.com      Ready                         agent                  317d   v1.22.6-sph-db56d46

Logs

  • ssh into the ESXi worker node.
tail -f /var/log/spherelet.log 


Status

  • ssh into the ESXi worker node and run the following:
etc/init.d/spherelet status
  •  You can check status of spherelet using PowerCLI. Following is an example:
> Connect-VIServer wdc-10-vcxx

> Get-VMHost | Get-VMHostService | where {$_.Key -eq "spherelet"}  | select VMHost,Key,Running | ft

VMHost                        Key       Running
------                        ---       -------
wdc-10-r0xxxxxxxxxxxxxxxxxxxx spherelet    True
wdc-10-r0xxxxxxxxxxxxxxxxxxxx spherelet    True
wdc-10-r0xxxxxxxxxxxxxxxxxxxx spherelet    True
wdc-10-r0xxxxxxxxxxxxxxxxxxxx spherelet    True
wdc-10-r0xxxxxxxxxxxxxxxxxxxx spherelet    True
wdc-10-r0xxxxxxxxxxxxxxxxxxxx spherelet    True
wdc-10-r0xxxxxxxxxxxxxxxxxxxx spherelet    True
wdc-10-r0xxxxxxxxxxxxxxxxxxxx spherelet    True
wdc-10-r0xxxxxxxxxxxxxxxxxxxx spherelet    True
wdc-10-r0xxxxxxxxxxxxxxxxxxxx spherelet    True
wdc-10-r0xxxxxxxxxxxxxxxxxxxx spherelet    True
wdc-10-r0xxxxxxxxxxxxxxxxxxxx spherelet    True

Restart

  • ssh into the ESXi worker node and run the following:
/etc/init.d/spherelet restart
  • You can also restart spherelet service using PowerCLI. Following is an example to restart spherelet service on ALL the ESXi worker nodes of a cluster:
> Get-Cluster

Name                           HAEnabled  HAFailover DrsEnabled DrsAutomationLevel
                                          Level
----                           ---------  ---------- ---------- ------------------
wdc-10-vcxxc01                 True       1          True       FullyAutomated

> Get-Cluster -Name wdc-10-vcxxc01 | Get-VMHost | foreach { Restart-VMHostService -HostService ($_ | Get-VMHostService | where {$_.Key -eq "spherelet"}) }

Certificates

You may notice the ESXi worker nodes in NotReady state when the following spherelet certs expire.
  • /etc/vmware/spherelet/spherelet.crt
  • /etc/vmware/spherelet/client.crt
 
An example is given below:
❯ kg no
NAME STATUS ROLES AGE VERSION
420802008ec0d8ccaa6ac84140768375 Ready control-plane,master 70d v1.22.6+vmware.wcp.2
42087a63440b500de6cec759bb5900bf Ready control-plane,master 77d v1.22.6+vmware.wcp.2
4208e08c826dfe283c726bc573109dbb Ready control-plane,master 77d v1.22.6+vmware.wcp.2
wdc-08-rxxesx25.xxxxxxxxx.com NotReady agent 370d v1.22.6-sph-db56d46
wdc-08-rxxesx26.
xxxxxxxxx.com NotReady agent 370d v1.22.6-sph-db56d46
wdc-08-rxxesx23.
xxxxxxxxx.com NotReady agent 370d v1.22.6-sph-db56d46
wdc-08-rxxesx24.
xxxxxxxxx.com NotReady agent 370d v1.22.6-sph-db56d46
wdc-08-rxxesx25.
xxxxxxxxx.com NotReady agent 370d v1.22.6-sph-db56d46
wdc-08-rxxesx26.
xxxxxxxxx.com NotReady agent 370d v1.22.6-sph-db56d46

You can ssh into the ESXi worker nodes and verify the validity of the above mentioned certs. They have a life time of one year.
 
Example:
[root@wdc-08-rxxesx25:~] openssl x509 -enddate -noout -in /etc/vmware/spherelet/spherelet.crt
notAfter=Sep 1 08:32:24 2023 GMT
[root@wdc-08-rxxesx25:~] openssl x509 -enddate -noout -in /etc/vmware/spherelet/client.crt
notAfter=Sep 1 08:32:24 2023 GMT
Depending on your support contract, if its a production environment you may need to open a case with VMware GSS for resolving this issue. 
 
Ref KBs: 


Verify

❯ kubectl get node
NAME                               STATUS   ROLES                  AGE     VERSION
42017dcb669bea2962da27fc2f6c16d2   Ready    control-plane,master   5d20h   v1.23.12+vmware.wcp.1
4201b763c766875b77bcb9f04f8840b3   Ready    control-plane,master   5d21h   v1.23.12+vmware.wcp.1
4201dab068e9b2d3af3b8fde450b3d96   Ready    control-plane,master   5d20h   v1.23.12+vmware.wcp.1
wdc-01-rxxesx04.xxxxxxxxx.com      Ready    agent                  5d19h   v1.23.5-sph-81ef5d1
wdc-01-rxxesx05.xxxxxxxxx.com      Ready    agent                  5d19h   v1.23.5-sph-81ef5d1
wdc-01-rxxesx06.xxxxxxxxx.com      Ready    agent                  5d19h   v1.23.5-sph-81ef5d1
wdc-01-rxxesx32.xxxxxxxxx.com      Ready    agent                  5d19h   v1.23.5-sph-81ef5d1
wdc-01-rxxesx33.xxxxxxxxx.com      Ready    agent                  5d19h   v1.23.5-sph-81ef5d1
wdc-01-rxxesx34.xxxxxxxxx.com      Ready    agent                  5d19h   v1.23.5-sph-81ef5d1
wdc-01-rxxesx35.xxxxxxxxx.com      Ready    agent                  5d19h   v1.23.5-sph-81ef5d1
wdc-01-rxxesx36.xxxxxxxxx.com      Ready    agent                  5d19h   v1.23.5-sph-81ef5d1
wdc-01-rxxesx37.xxxxxxxxx.com      Ready    agent                  5d19h   v1.23.5-sph-81ef5d1
wdc-01-rxxesx38.xxxxxxxxx.com      Ready    agent                  5d19h   v1.23.5-sph-81ef5d1
wdc-01-rxxesx39.xxxxxxxxx.com      Ready    agent                  5d19h   v1.23.5-sph-81ef5d1
wdc-01-rxxesx40.xxxxxxxxx.com      Ready    agent                  5d19h   v1.23.5-sph-81ef5d1
   

Hope it was useful. Cheers!

Sunday, May 7, 2023

Kubernetes 101 - Part9 - kubeconfig certificate expiration

You can verify the expiration date of kubeconfig in the current context as follows:

kubectl config view --minify --raw --output 'jsonpath={..user.client-certificate-data}' | base64 -d | openssl x509 -noout -enddate

❯ k config current-context
sc2-01-vcxx

❯ kubectl config view --minify --raw --output 'jsonpath={..user.client-certificate-data}' | base64 -d | openssl x509 -noout -enddate
notAfter=Sep 6 05:13:47 2023 GMT

❯ date
Thu Sep 7 18:05:52 IST 2023


Hope it was useful. Cheers!