Saturday, June 11, 2022

Working with Kubernetes using Python - Part 04 - Get namespaces

The following code snippet uses the Python client for the Kubernetes API to get namespace details from a given context:
from kubernetes import client, config
import argparse


def load_kubeconfig(context_name):
    config.load_kube_config(context=context_name)
    v1 = client.CoreV1Api()
    return v1


def get_all_namespace(v1):
    print("Listing namespaces with their creation timestamp, and status:")
    ret = v1.list_namespace()
    for i in ret.items:
        print(i.metadata.name, i.metadata.creation_timestamp, i.status.phase)


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("-c", "--context", required=True, help="K8s context")
    args = parser.parse_args()

    context = args.context
    v1 = load_kubeconfig(context)
    get_all_namespace(v1)


if __name__ == "__main__":
    main()


Saturday, May 21, 2022

vSphere with Tanzu using NSX-T - Part15 - Working with etcd on TKC with one control plane

In this article, we will see how to work with the etcd database of a Tanzu Kubernetes Cluster (TKC) that has a single control plane node and perform some basic operations. Following is a TKC with one control plane node and three worker nodes:

Get K8s cluster nodes
❯ gcc kg no
NAME                                STATUS   ROLES                  AGE     VERSION
gc-control-plane-6g9gk              Ready    control-plane,master   3d7h    v1.21.6+vmware.1
gc-workers-rmgkm-78cf46d595-n5qp8   Ready    <none>                 7d19h   v1.21.6+vmware.1
gc-workers-rmgkm-78cf46d595-wds2m   Ready    <none>                 7d19h   v1.21.6+vmware.1
gc-workers-rmgkm-78cf46d595-z2wvt   Ready    <none>                 7d19h   v1.21.6+vmware.1
Get the etcd pod and describe it
❯ gcc kg pod -A | grep etcd
kube-system etcd-gc-control-plane-6g9gk 1/1 Running 0 3d7h

❯ gcc kd pod etcd-gc-control-plane-6g9gk -n kube-system
Name:                 etcd-gc-control-plane-6g9gk
Namespace:            kube-system
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 gc-control-plane-6g9gk/100.68.36.38
Start Time:           Tue, 19 Jul 2022 11:14:25 +0530
Labels:               component=etcd
                      tier=control-plane
Annotations:          kubeadm.kubernetes.io/etcd.advertise-client-urls: https://100.68.36.38:2379
                      kubernetes.io/config.hash: 6e7bc05d35060112913f78af2043683f
                      kubernetes.io/config.mirror: 6e7bc05d35060112913f78af2043683f
                      kubernetes.io/config.seen: 2022-07-19T05:44:19.416549595Z
                      kubernetes.io/config.source: file
                      kubernetes.io/psp: vmware-system-privileged
Status:               Running
IP:                   100.68.36.38
IPs:
  IP:  100.68.36.38
Controlled By:  Node/gc-control-plane-6g9gk
Containers:
  etcd:
    Container ID:  containerd://253c7b25bd60ea78dfccad52d03534785f0d7b7a1fa7105dbd55d7727f8785c3
    Image:         localhost:5000/vmware.io/etcd:v3.4.13_vmware.22
    Image ID:      sha256:78661ebbe1adaee60336a0f8ff031c4537ff309ef51feab6e840e7dbb3cbf47d
    Port:          <none>
    Host Port:     <none>
    Command:
      etcd
      --advertise-client-urls=https://100.68.36.38:2379
      --cert-file=/etc/kubernetes/pki/etcd/server.crt
      --client-cert-auth=true
      --data-dir=/var/lib/etcd
      --initial-advertise-peer-urls=https://100.68.36.38:2380
      --initial-cluster=gc-control-plane-6g9gk=https://100.68.36.38:2380,gc-control-plane-64lq5=https://100.68.36.34:2380
      --initial-cluster-state=existing
      --key-file=/etc/kubernetes/pki/etcd/server.key
      --listen-client-urls=https://127.0.0.1:2379,https://100.68.36.38:2379
      --listen-metrics-urls=http://127.0.0.1:2381
      --listen-peer-urls=https://100.68.36.38:2380
      --name=gc-control-plane-6g9gk
      --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
      --peer-client-cert-auth=true
      --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
      --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
      --snapshot-count=10000
      --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    State:          Running
      Started:      Tue, 19 Jul 2022 11:14:27 +0530
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:     100m
      memory:  100Mi
    Liveness:     http-get http://127.0.0.1:2381/health delay=10s timeout=15s period=10s #success=1 #failure=8
    Startup:      http-get http://127.0.0.1:2381/health delay=10s timeout=15s period=10s #success=1 #failure=24
    Environment:  <none>
    Mounts:
      /etc/kubernetes/pki/etcd from etcd-certs (rw)
      /var/lib/etcd from etcd-data (rw)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  etcd-certs:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/pki/etcd
    HostPathType:  DirectoryOrCreate
  etcd-data:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/etcd
    HostPathType:  DirectoryOrCreate
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     :NoExecute op=Exists
Events:          <none>

Exec into the etcd pod and run etcdctl commands

You can use etcdctl, but you need to provide the cacert, cert, and key details. All of this information is available in the output of the etcd pod description above.

❯ gcc k exec -it etcd-gc-control-plane-6g9gk -n kube-system -- sh -c "ETCDCTL_API=3 etcdctl member list --endpoints=127.0.0.1:2379 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key"
c5c44d96f675add8, started, gc-control-plane-6g9gk, https://100.68.36.38:2380, https://100.68.36.38:2379, false


❯ gcc k exec -it etcd-gc-control-plane-6g9gk -n kube-system -- sh -c "ETCDCTL_API=3 etcdctl endpoint health --endpoints=127.0.0.1:2379 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key --write-out json"
[{"endpoint":"127.0.0.1:2379","health":true,"took":"9.689387ms"}]

❯ gcc k exec -it etcd-gc-control-plane-6g9gk -n kube-system -- sh -c "ETCDCTL_API=3 etcdctl
endpoint status --endpoints=127.0.0.1:2379 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key -w json"
[{"Endpoint":"127.0.0.1:2379","Status":{"header":{"cluster_id":4073335150581888229,"member_id":14250600431682432472,"revision":2153804,"raft_term":11},"version":"3.4.13","dbSize":24719360,"leader":14250600431682432472,"raftIndex":2429139,"raftTerm":11,"raftAppliedIndex":2429139,"dbSizeInUse":2678784}}]


Snapshot etcd
❯ gcc k exec -it etcd-gc-control-plane-6g9gk -n kube-system -- sh -c "ETCDCTL_API=3 etcdctl --endpoints=127.0.0.1:2379 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key snapshot save snapshotdb-$(date +%d-%m-%y)"
{"level":"info","ts":1658651541.2698474,"caller":"snapshot/v3_snapshot.go:119","msg":"created temporary db file","path":"snapshotdb-24-07-22.part"}
{"level":"info","ts":"2022-07-24T08:32:21.277Z","caller":"clientv3/maintenance.go:200","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":1658651541.2771788,"caller":"snapshot/v3_snapshot.go:127","msg":"fetching snapshot","endpoint":"127.0.0.1:2379"}
{"level":"info","ts":"2022-07-24T08:32:21.594Z","caller":"clientv3/maintenance.go:208","msg":"completed snapshot read; closing"}
{"level":"info","ts":1658651541.621639,"caller":"snapshot/v3_snapshot.go:142","msg":"fetched snapshot","endpoint":"127.0.0.1:2379","size":"25 MB","took":0.344746859}
{"level":"info","ts":1658651541.621852,"caller":"snapshot/v3_snapshot.go:152","msg":"saved","path":"snapshotdb-24-07-22"}
Snapshot saved at snapshotdb-24-07-22

❯ gcc k exec -it etcd-gc-control-plane-6g9gk -n kube-system -- sh
# ls
bin boot dev etc home lib lib64 media mnt opt proc root run sbin snapshotdb-24-07-22 srv sys tmp usr var
#
# exit

❯ gcc k exec -it etcd-gc-control-plane-6g9gk -n kube-system -- sh -c "ETCDCTL_API=3 etcdctl --endpoints=127.0.0.1:2379 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key snapshot status snapshotdb-24-07-22 -w table"
+----------+----------+------------+------------+
|   HASH   | REVISION | TOTAL KEYS | TOTAL SIZE |
+----------+----------+------------+------------+
| b0910e83 |  2434362 |       1580 |      25 MB |
+----------+----------+------------+------------+

Copy snapshot file from etcd pod to local machine

Note: Even though there was an error while copying the snapshot file from the pod to the local machine, you can see that the file was successfully copied. I also verified the snapshot file status using etcdctl, and every field (hash, total keys, etc.) matches that of the source file.

❯ gcc kubectl cp kube-system/etcd-gc-control-plane-6g9gk:/snapshotdb-24-07-22 snapshotdb-24-07-22
tar: Removing leading `/' from member names
error: unexpected EOF
❯ ls snapshotdb-24-07-22
snapshotdb-24-07-22
❯ ETCDCTL_API=3 etcdctl snapshot status snapshotdb-24-07-22 -w table
Deprecated: Use `etcdutl snapshot status` instead.

+----------+----------+------------+------------+
| HASH | REVISION | TOTAL KEYS | TOTAL SIZE |
+----------+----------+------------+------------+
| b0910e83 | 2434362 | 1580 | 25 MB |
+----------+----------+------------+------------+   

Restore etcd 

We can restore the etcd snapshot using etcdctl from the TKC control plane node. In order to connect to the control plane VM, we need to create a jumpbox pod under the corresponding supervisor namespace.

So, first copy the snapshot file from the local machine to the jumpbox pod.

❯ ls snapshotdb-24-07-22
snapshotdb-24-07-22
❯ kubectl cp snapshotdb-24-07-22 vineetha-test04-deploy/jumpbox01:/

❯ k exec -it jumpbox01 -n vineetha-test04-deploy -- sh
sh-4.4# su
root [ / ]# ls
bin dev home lib64 mnt root sbin srv tmp var
boot etc lib media proc run snapshotdb-24-07-22 sys usr
root [ / ]#
Next, copy the snapshot file from the jumpbox pod to the control plane node.
❯ gcc kg po -A -o wide| grep etcd
kube-system etcd-gc-control-plane-6g9gk 1/1 Running 0 166m 100.68.36.38 gc-control-plane-6g9gk <none> <none>

❯ k exec -it jumpbox01 -n vineetha-test04-deploy -- scp /snapshotdb-24-07-22 vmware-system-user@100.68.36.38:/tmp
snapshotdb-24-07-22 100% 20MB 126.1MB/s 00:00

❯ k exec -it jumpbox01 -n vineetha-test04-deploy -- /usr/bin/ssh vmware-system-user@100.68.36.38
Welcome to Photon 3.0 (\m) - Kernel \r (\l)
Last login: Sun Jul 24 13:02:39 2022 from 100.68.35.210
13:14:29 up 5 days, 7:38, 0 users, load average: 0.98, 0.53, 0.31

26 Security notice(s)
Run 'tdnf updateinfo info' to see the details.
vmware-system-user@gc-control-plane-6g9gk [ ~ ]$ sudo su
root [ /home/vmware-system-user ]#
root [ /home/vmware-system-user ]# cd /tmp/
 
Install etcd on the control plane node so that we have access to the etcdctl utility.
root [ /tmp ]# tdnf install etcd
root [ /tmp ]# ETCDCTL_API=3 etcdctl member list --endpoints=127.0.0.1:2379 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key
c5c44d96f675add8, started, gc-control-plane-6g9gk, https://100.68.36.38:2380, https://100.68.36.38:2379, false
root [ /tmp ]#
root [ /tmp ]# ETCDCTL_API=3 etcdctl snapshot status snapshotdb-24-07-22 --endpoints=127.0.0.1:2379 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key
Deprecated: Use `etcdutl snapshot status` instead.

b0910e83, 2434362, 1580, 25 MB
root [ /tmp ]# hostname
gc-control-plane-6g9gk
root [ /tmp ]# ETCDCTL_API=3 etcdctl --endpoints=127.0.0.1:2379 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key snapshot restore /tmp/snapshotdb-24-07-22 --data-dir=/var/lib/etcd-backup --skip-hash-check=true
Deprecated: Use `etcdutl snapshot restore` instead.

2022-07-24T13:20:44Z info snapshot/v3_snapshot.go:251 restoring snapshot {"path": "/tmp/snapshotdb-24-07-22", "wal-dir": "/var/lib/etcd-backup/member/wal", "data-dir": "/var/lib/etcd-backup", "snap-dir": "/var/lib/etcd-backup/member/snap", "stack": "go.etcd.io/etcd/etcdutl/v3/snapshot.(*v3Manager).Restore\n\t/usr/src/photon/BUILD/etcd-3.5.1/etcdutl/snapshot/v3_snapshot.go:257\ngo.etcd.io/etcd/etcdutl/v3/etcdutl.SnapshotRestoreCommandFunc\n\t/usr/src/photon/BUILD/etcd-3.5.1/etcdutl/etcdutl/snapshot_command.go:147\ngo.etcd.io/etcd/etcdctl/v3/ctlv3/command.snapshotRestoreCommandFunc\n\t/usr/src/photon/BUILD/etcd-3.5.1/etcdctl/ctlv3/command/snapshot_command.go:128\ngithub.com/spf13/cobra.(*Command).execute\n\t/usr/share/gocode/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:856\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/usr/share/gocode/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:960\ngithub.com/spf13/cobra.(*Command).Execute\n\t/usr/share/gocode/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:897\ngo.etcd.io/etcd/etcdctl/v3/ctlv3.Start\n\t/usr/src/photon/BUILD/etcd-3.5.1/etcdctl/ctlv3/ctl.go:107\ngo.etcd.io/etcd/etcdctl/v3/ctlv3.MustStart\n\t/usr/src/photon/BUILD/etcd-3.5.1/etcdctl/ctlv3/ctl.go:111\nmain.main\n\t/usr/src/photon/BUILD/etcd-3.5.1/etcdctl/main.go:59\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:250"}
2022-07-24T13:20:44Z info membership/store.go:141 Trimming membership information from the backend...
2022-07-24T13:20:44Z info membership/cluster.go:421 added member {"cluster-id": "cdf818194e3a8c32", "local-member-id": "0", "added-peer-id": "8e9e05c52164694d", "added-peer-peer-urls": ["http://localhost:2380"]}
2022-07-24T13:20:44Z info snapshot/v3_snapshot.go:272 restored snapshot {"path": "/tmp/snapshotdb-24-07-22", "wal-dir": "/var/lib/etcd-backup/member/wal", "data-dir": "/var/lib/etcd-backup", "snap-dir": "/var/lib/etcd-backup/member/snap"}
root [ /var/lib ]#
 
We have restored the database snapshot to a new location (--data-dir=/var/lib/etcd-backup). So we need to change the etcd-data hostPath to path: /var/lib/etcd-backup in the etcd static pod manifest file (etcd.yaml). Copy the contents of the etcd.yaml file.
root [ /var/lib ]# cd /etc/kubernetes/manifests/
root [ /etc/kubernetes/manifests ]# ls
etcd.yaml kube-controller-manager.yaml registry.yaml
kube-apiserver.yaml kube-scheduler.yaml
root [ /etc/kubernetes/manifests ]# cat etcd.yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/etcd.advertise-client-urls: https://100.68.36.38:2379
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://100.68.36.38:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --initial-advertise-peer-urls=https://100.68.36.38:2380
    - --initial-cluster=gc-control-plane-6g9gk=https://100.68.36.38:2380,gc-control-plane-64lq5=https://100.68.36.34:2380
    - --initial-cluster-state=existing
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://127.0.0.1:2379,https://100.68.36.38:2379
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://100.68.36.38:2380
    - --name=gc-control-plane-6g9gk
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: localhost:5000/vmware.io/etcd:v3.4.13_vmware.22
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: etcd
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /health
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /var/lib/etcd
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priorityClassName: system-node-critical
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
status: {}
I was having difficulty modifying it in the terminal, so I copied the contents of the etcd.yaml file locally, modified the path, removed the existing etcd.yaml file, created a new etcd.yaml file, and pasted the modified content into it.
root [ /etc/kubernetes/manifests ]# rm etcd.yaml
root [ /etc/kubernetes/manifests ]# vi etcd.yaml

<paste the above etcd.yaml file contents, with the modified etcd-data hostPath; the last part of the yaml will look like below:

  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd-backup
      type: DirectoryOrCreate
    name: etcd-data
status: {}

>
Once etcd.yaml is saved, after a few seconds you can see that the etcd pod is running.
root [ /etc/kubernetes/manifests ]# crictl ps -a
CONTAINER IMAGE CREATED STATE NAME ATTEMPT POD ID
92dad43a85ebc 78661ebbe1ada 1 second ago Running etcd 0 8258804cf17bb
5c704092eb4bb 25605c4ab20fe 10 seconds ago Running csi-resizer 10 1fa9feb732df5
495b5ff250cb4 05cfd9e3c3f22 10 seconds ago Running csi-provisioner 22 1fa9feb732df5
c0da43af7f1d0 a145efcc3afb4 11 seconds ago Running vsphere-syncer 10 1fa9feb732df5
4e6d67dc16f4a 5cb2119a4d797 11 seconds ago Running kube-controller-manager 11 3d1c266aa5e24
8710a7b6a8563 fa70d7ee973ad 11 seconds ago Running guest-cluster-cloud-provider 35 3f3d5eb0929e5
4e0c1e2e72682 f18cde23836f5 11 seconds ago Running csi-attacher 10 1fa9feb732df5
bf6771ca4dc4d a609b91a17410 11 seconds ago Running kube-scheduler 11 f50eada3f8127
05fa3f2f587e8 fa70d7ee973ad 3 hours ago Exited guest-cluster-cloud-provider 34 3f3d5eb0929e5
888ba7ce34d92 3f6d2884f8105 3 hours ago Running kube-apiserver 28 c368bd9937f8b
6579ffffa5f53 382a8821c56e0 3 hours ago Running metrics-server 21 622c835648008
bdd0c760d0bc7 382a8821c56e0 3 hours ago Exited metrics-server 20 622c835648008
6063485d73e38 78661ebbe1ada 3 hours ago Exited etcd 0 71e6ae04726ad
259595a330d26 3f6d2884f8105 3 hours ago Exited kube-apiserver 27 c368bd9937f8b
5034f4ea18f1f 25605c4ab20fe 4 hours ago Exited csi-resizer 9 1fa9feb732df5
21de3c4850dc3 05cfd9e3c3f22 4 hours ago Exited csi-provisioner 21 1fa9feb732df5
d623946b1d270 a145efcc3afb4 4 hours ago Exited vsphere-syncer 9 1fa9feb732df5
5f50b9e93d287 a609b91a17410 4 hours ago Exited kube-scheduler 10 f50eada3f8127
ec4e066f54fd6 5cb2119a4d797 4 hours ago Exited kube-controller-manager 10 3d1c266aa5e24
f05fee251a700 f18cde23836f5 4 hours ago Exited csi-attacher 9 1fa9feb732df5
d3577bf8477d0 b0f879c3b53ce 5 days ago Running liveness-probe 0 1fa9feb732df5
d30ba30b8c203 4251b7012fd43 5 days ago Running vsphere-csi-controller 0 1fa9feb732df5
1816ed6aada5f 02abc4bd595a0 5 days ago Running guest-cluster-auth-service 0 bff70bcd389be
3e014a6745f5a b0f879c3b53ce 5 days ago Running liveness-probe 0 66e5ca3abfe5b
ba82f29cf8939 4251b7012fd43 5 days ago Running vsphere-csi-node 0 66e5ca3abfe5b
9c29960956718 f3fe18dd8cea2 5 days ago Running node-driver-registrar 0 66e5ca3abfe5b
92aa3a904d72e 0515f8357a522 5 days ago Running antrea-agent 3 4ae63a9f8e4cb
9e0f32fa88663 0515f8357a522 5 days ago Exited antrea-agent 2 4ae63a9f8e4cb
ac5573a4d93f8 0515f8357a522 5 days ago Running antrea-ovs 0 4ae63a9f8e4cb
e14fea2c37c21 0515f8357a522 5 days ago Exited install-cni 0 4ae63a9f8e4cb
166627360a434 7fde82047d4f6 5 days ago Running docker-registry 0 c84c550a9ab90
ecfbdcd23858d f31127f4a3471 5 days ago Running kube-proxy 0 07f5b9be02414
root [ /etc/kubernetes/manifests ]# exit
exit
vmware-system-user@gc-control-plane-6g9gk [ ~ ]$
vmware-system-user@gc-control-plane-6g9gk [ ~ ]$ exit
logout

Verify

In my case, I had a namespace vineethac-testing with two nginx pods running under it when the snapshot was taken. After the snapshot was taken, I deleted the two nginx pods and the namespace vineethac-testing. After restoring the etcd snapshot, I can see that the namespace vineethac-testing is active again with the two nginx pods under it.

❯ gcc kg ns
NAME                           STATUS   AGE
default                        Active   9d
kube-node-lease                Active   9d
kube-public                    Active   9d
kube-system                    Active   9d
vineethac-testing              Active   4h28m
vmware-system-auth             Active   9d
vmware-system-cloud-provider   Active   9d
vmware-system-csi              Active   9d

❯ gcc kg pods -n vineethac-testing
NAME     READY   STATUS    RESTARTS   AGE
nginx1   1/1     Running   0          4h25m
nginx2   1/1     Running   0          4h23m

Hope it was useful. Cheers!

Note: I've tested this in a lab. This may not be the best practice procedure and may slightly vary in a real world environment.

Friday, April 8, 2022

Working with Kubernetes using Python - Part 03 - Get nodes

The following code snippet uses the kubeconfig Python module to switch context and the Python client for the Kubernetes API to get cluster node details. It takes the default kubeconfig file, switches to the required context, and gets the node info of the respective cluster.

kubectl commands: 

kubectl config get-contexts
kubectl config current-context
kubectl config use-context <context_name>
kubectl get nodes -o json

Code: 

Reference:

https://kubeconfig-python.readthedocs.io/en/latest/
https://github.com/kubernetes-client/python

Hope it was useful. Cheers!

Saturday, March 19, 2022

Working with Kubernetes using Python - Part 02 - Switch context

The following code snippet uses the kubeconfig Python module; it takes the default kubeconfig file and switches to a new context.

kubectl commands: 

kubectl config get-contexts
kubectl config current-context
kubectl config use-context <context_name>

Code: 

Note: If you want to use a specific kubeconfig file, instead of  conf = KubeConfig() 

you can use conf = KubeConfig('path-to-your-kubeconfig')

Reference:

https://kubeconfig-python.readthedocs.io/en/latest/

Hope it was useful. Cheers!

Friday, March 4, 2022

Working with Kubernetes using Python - Part 01 - List contexts

The following code snippet uses the Python client for the Kubernetes API; it takes the default kubeconfig file and lists the contexts and the active context.

kubectl commands: 

kubectl config get-contexts
kubectl config current-context

Code: 

Note: If you want to use a specific kubeconfig file, instead of  

contexts, active_context = config.list_kube_config_contexts() 

you can use  

contexts, active_context = config.list_kube_config_contexts(config_file="path-to-kubeconfig")

Reference:

https://github.com/kubernetes-client/python

Hope it was useful. Cheers!

Saturday, February 19, 2022

Kubernetes 101 - Part5 - Removing namespaces stuck in terminating state

Namespaces getting stuck in a terminating state is one of the common issues I have seen while working with K8s. Here is an example namespace in terminating state, and you can see there are no resources under it. In this case, we remove the namespace by setting its finalizers to null.

% kg ns rohitgu-intelligence-cluster-6
NAME                             STATUS        AGE
rohitgu-intelligence-cluster-6   Terminating   188d

% kg pods,tkc,all,cluster-api -n rohitgu-intelligence-cluster-6
No resources found in rohitgu-intelligence-cluster-6 namespace.

% kg ns rohitgu-intelligence-cluster-6 -o json > rohitgu-intelligence-cluster-6-json

% jq '.spec.finalizers = [] | .metadata.finalizers = []' rohitgu-intelligence-cluster-6-json > rohitgu-intelligence-cluster-6-json-nofinalizer

% cat rohitgu-intelligence-cluster-6-json-nofinalizer
{
  "apiVersion": "v1",
  "kind": "Namespace",
  "metadata": {
    "annotations": {
      "calaxxxxx.xxxx.com/ccsrole-created": "true",
      "calaxxxxx.xxxx.com/owner": "rohitgu",
      "calaxxxxx.xxxx.com/user-namespace": "rohitgu",
      "ls_id-0": "e28442b5-ace0-4e20-b5a0-c32bc72427d9",
      "ncp/extpoolid": "domain-c1034:02cde809-99d1-423e-aac9-014889740308-ippool-10-186-120-1-10-186-123-254",
      "ncp/router_id": "t1_87d44fc8-ac60-441a-8e35-509ff31a4eba_rtr",
      "ncp/snat_ip": "10.186.120.40",
      "ncp/subnet-0": "172.29.1.144/28",
      "vmware-system-resource-pool": "resgroup-663809",
      "vmware-system-vm-folder": "group-v663810"
    },
    "creationTimestamp": "2021-08-26T09:11:39Z",
    "deletionTimestamp": "2022-03-02T07:16:27Z",
    "labels": {
      "kubernetes.io/metadata.name": "rohitgu-intelligence-cluster-6",
      "vSphereClusterID": "domain-c1034"
    },
    "name": "rohitgu-intelligence-cluster-6",
    "resourceVersion": "1133900371",
    "selfLink": "/api/v1/namespaces/rohitgu-intelligence-cluster-6",
    "uid": "87d44fc8-ac60-441a-8e35-509ff31a4eba",
    "finalizers": []
  },
  "spec": {
    "finalizers": []
  },
  "status": {
    "conditions": [
      {
        "lastTransitionTime": "2022-03-02T07:16:32Z",
        "message": "Discovery failed for some groups, 1 failing: unable to retrieve the complete list of server APIs: data.packaging.carvel.dev/v1alpha1: the server is currently unable to handle the request",
        "reason": "DiscoveryFailed",
        "status": "True",
        "type": "NamespaceDeletionDiscoveryFailure"
      },
      {
        "lastTransitionTime": "2022-03-02T07:16:56Z",
        "message": "All legacy kube types successfully parsed",
        "reason": "ParsedGroupVersions",
        "status": "False",
        "type": "NamespaceDeletionGroupVersionParsingFailure"
      },
      {
        "lastTransitionTime": "2022-03-02T07:16:56Z",
        "message": "All content successfully deleted, may be waiting on finalization",
        "reason": "ContentDeleted",
        "status": "False",
        "type": "NamespaceDeletionContentFailure"
      },
      {
        "lastTransitionTime": "2022-03-02T07:23:22Z",
        "message": "All content successfully removed",
        "reason": "ContentRemoved",
        "status": "False",
        "type": "NamespaceContentRemaining"
      },
      {
        "lastTransitionTime": "2022-03-02T07:23:22Z",
        "message": "All content-preserving finalizers finished",
        "reason": "ContentHasNoFinalizers",
        "status": "False",
        "type": "NamespaceFinalizersRemaining"
      }
    ],
    "phase": "Terminating"
  }
}

% kubectl replace --raw "/api/v1/namespaces/rohitgu-intelligence-cluster-6/finalize" -f rohitgu-intelligence-cluster-6-json-nofinalizer

% kg ns rohitgu-intelligence-cluster-6
Error from server (NotFound): namespaces "rohitgu-intelligence-cluster-6" not found

Hope it was useful. Cheers!

Sunday, January 30, 2022

vSphere with Tanzu using NSX-T - Part14 - Testing TKC storage using kubestr

In the previous posts we discussed the following:

This article is about using kubestr to test the storage options of a Tanzu Kubernetes Cluster (TKC). Following are the steps to install kubestr on macOS:

  • wget https://github.com/kastenhq/kubestr/releases/download/v0.4.31/kubestr_0.4.31_MacOS_amd64.tar.gz
  • tar -xvf kubestr_0.4.31_MacOS_amd64.tar.gz 
  • chmod +x kubestr
  • mv kubestr /usr/local/bin


Now, let's run kubestr help.

% kubestr help
kubestr is a tool that will scan your k8s cluster
        and validate that the storage systems in place as well as run
        performance tests.

Usage:
  kubestr [flags]
  kubestr [command]

Available Commands:
  browse      Browse the contents of a CSI PVC via file browser
  csicheck    Runs the CSI snapshot restore check
  fio         Runs an fio test
  help        Help about any command

Flags:
  -h, --help             help for kubestr
  -e, --outfile string   The file where test results will be written
  -o, --output string    Options(json)

Use "kubestr [command] --help" for more information about a command.


I am going to use the following TKC for testing.

% KUBECONFIG=gc.kubeconfig kubectl get nodes                                            
NAME                               STATUS   ROLES                  AGE    VERSION
gc-control-plane-pwngg             Ready    control-plane,master   103d   v1.20.9+vmware.1
gc-workers-wrknn-f675446b6-cz766   Ready    <none>                 103d   v1.20.9+vmware.1
gc-workers-wrknn-f675446b6-f6zqs   Ready    <none>                 103d   v1.20.9+vmware.1
gc-workers-wrknn-f675446b6-rsf6n   Ready    <none>                 103d   v1.20.9+vmware.1


Let's run kubestr against the cluster now.

% KUBECONFIG=gc.kubeconfig kubestr                                      

**************************************
  _  ___   _ ___ ___ ___ _____ ___
  | |/ / | | | _ ) __/ __|_   _| _ \
  | ' <| |_| | _ \ _|\__ \ | | |   /
  |_|\_\\___/|___/___|___/ |_| |_|_\

Explore your Kubernetes storage options
**************************************
Kubernetes Version Check:
  Valid kubernetes version (v1.20.9+vmware.1)  -  OK

RBAC Check:
  Kubernetes RBAC is enabled  -  OK

Aggregated Layer Check:
  The Kubernetes Aggregated Layer is enabled  -  OK

W0130 14:17:16.937556   87541 warnings.go:70] storage.k8s.io/v1beta1 CSIDriver is deprecated in v1.19+, unavailable in v1.22+; use storage.k8s.io/v1 CSIDriver
Available Storage Provisioners:

  csi.vsphere.xxxx.com:
    Can't find the CSI snapshot group api version.
    This is a CSI driver!
    (The following info may not be up to date. Please check with the provider for more information.)
    Provider:            vSphere
    Website:             https://github.com/kubernetes-sigs/vsphere-csi-driver
    Description:         A Container Storage Interface (CSI) Driver for VMware vSphere
    Additional Features: Raw Block,<br/><br/>Expansion (Block Volume),<br/><br/>Topology Aware (Block Volume)

    Storage Classes:
      * sc2-01-vc16c01-wcp-mgmt

    To perform a FIO test, run-
      ./kubestr fio -s <storage class>


You can run storage tests using kubestr and it uses FIO for generating IOs. For example this is how you can run a basic storage test.

% KUBECONFIG=gc.kubeconfig kubestr fio -s sc2-01-vc16c01-wcp-mgmt -z 10G
PVC created kubestr-fio-pvc-zvdhr
Pod created kubestr-fio-pod-kdbs5
Running FIO test (default-fio) on StorageClass (sc2-01-vc16c01-wcp-mgmt) with a PVC of Size (10G)
Elapsed time- 29.290421119s
FIO test results:
 
FIO version - fio-3.20
Global options - ioengine=libaio verify=0 direct=1 gtod_reduce=1

JobName: read_iops
  blocksize=4K filesize=2G iodepth=64 rw=randread
read:
  IOPS=3987.150391 BW(KiB/s)=15965
  iops: min=3680 max=4274 avg=3992.034424
  bw(KiB/s): min=14720 max=17096 avg=15968.827148

JobName: write_iops
  blocksize=4K filesize=2G iodepth=64 rw=randwrite
write:
  IOPS=3562.628906 BW(KiB/s)=14267
  iops: min=3237 max=3750 avg=3565.896484
  bw(KiB/s): min=12950 max=15000 avg=14264.862305

JobName: read_bw
  blocksize=128K filesize=2G iodepth=64 rw=randread
read:
  IOPS=2988.549316 BW(KiB/s)=383071
  iops: min=2756 max=3252 avg=2992.344727
  bw(KiB/s): min=352830 max=416256 avg=383056.187500

JobName: write_bw
  blocksize=128k filesize=2G iodepth=64 rw=randwrite
write:
  IOPS=2754.796143 BW(KiB/s)=353151
  iops: min=2480 max=2992 avg=2759.586182
  bw(KiB/s): min=317440 max=382976 avg=353242.781250

Disk stats (read/write):
  sdd: ios=117160/105647 merge=0/1210 ticks=2100090/2039676 in_queue=4139076, util=99.608589%
  -  OK

As you can see, a PVC of 10G and an FIO pod are created, and these are used for the FIO test. Once the test is complete, the PVC and the FIO pod are deleted automatically.

I hope it was useful. Cheers!