vineethac.blogspot.com: workload management


Friday, November 5, 2021

vSphere with Tanzu using NSX-T - Part12 - Deploy application on TKC and access it

In the previous posts we discussed the following:

Part1 - Prerequisites
Part2 - Configure NSX
Part3 - Edge Cluster
Part4 - Tier-0 Gateway and BGP peering
Part5 - Tier-1 Gateway and Segments
Part6 - Create tags, storage policy, and content library
Part7 - Enable workload management
Part8 - Create namespace and deploy Tanzu Kubernetes Cluster
Part9 - Monitoring
Part10 - Upgrade Tanzu Kubernetes Cluster
Part11 - Troubleshooting TKC

This article walks you through the steps to deploy an application on a Tanzu Kubernetes Cluster (TKC) and access it. I will also try to explain how this works under the hood.

Here I have a TKC cluster as shown below:

% KUBECONFIG=gc.kubeconfig kg nodes
NAME                               STATUS   ROLES                  AGE   VERSION
gc-control-plane-pwngg             Ready    control-plane,master   49d   v1.20.9+vmware.1
gc-workers-wrknn-f675446b6-cz766   Ready    <none>                 49d   v1.20.9+vmware.1
gc-workers-wrknn-f675446b6-f6zqs   Ready    <none>                 49d   v1.20.9+vmware.1
gc-workers-wrknn-f675446b6-rsf6n   Ready    <none>                 49d   v1.20.9+vmware.1

% KUBECONFIG=gc.kubeconfig kg nodes -o wide
NAME                               STATUS   ROLES                  AGE   VERSION            INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                 KERNEL-VERSION       CONTAINER-RUNTIME
gc-control-plane-pwngg             Ready    control-plane,master   49d   v1.20.9+vmware.1   172.29.21.194   <none>        VMware Photon OS/Linux   4.19.191-4.ph3-esx   containerd://1.4.6
gc-workers-wrknn-f675446b6-cz766   Ready    <none>                 49d   v1.20.9+vmware.1   172.29.21.195   <none>        VMware Photon OS/Linux   4.19.191-4.ph3-esx   containerd://1.4.6
gc-workers-wrknn-f675446b6-f6zqs   Ready    <none>                 49d   v1.20.9+vmware.1   172.29.21.196   <none>        VMware Photon OS/Linux   4.19.191-4.ph3-esx   containerd://1.4.6
gc-workers-wrknn-f675446b6-rsf6n   Ready    <none>                 49d   v1.20.9+vmware.1   172.29.21.197   <none>        VMware Photon OS/Linux   4.19.191-4.ph3-esx   containerd://1.4.6

01 Create a namespace

% KUBECONFIG=gc.kubeconfig k create ns webserver
namespace/webserver created

% KUBECONFIG=gc.kubeconfig kg ns
NAME                           STATUS   AGE
default                        Active   48d
kube-node-lease                Active   48d
kube-public                    Active   48d
kube-system                    Active   48d
vmware-system-auth             Active   48d
vmware-system-cloud-provider   Active   48d
vmware-system-csi              Active   48d
webserver                      Active   10s

02 Deploy nginx application

Following is the nginx-deployment.yaml spec to deploy nginx application:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
spec:
  selector:
    matchLabels:
      run: my-nginx
  replicas: 2
  template:
    metadata:
      labels:
        run: my-nginx
    spec:
      containers:
      - name: my-nginx
        image: nginx
        ports:
        - containerPort: 80

You can apply the yaml file as below:

% KUBECONFIG=gc.kubeconfig k apply -f nginx-deployment.yaml -n webserver
deployment.apps/my-nginx created

% KUBECONFIG=gc.kubeconfig kg deploy -n webserver
NAME       READY   UP-TO-DATE   AVAILABLE   AGE
my-nginx   0/2     0            0           3m3s

% KUBECONFIG=gc.kubeconfig kg events -n webserver
LAST SEEN   TYPE      REASON              OBJECT                           MESSAGE
26s         Warning   FailedCreate        replicaset/my-nginx-74d7c6cb98   Error creating: pods "my-nginx-74d7c6cb98-" is forbidden: PodSecurityPolicy: unable to admit pod: []
3m10s       Normal    ScalingReplicaSet   deployment/my-nginx              Scaled up replica set my-nginx-74d7c6cb98 to 2

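You can see that the pods failed to get created because no PodSecurityPolicy admits them. Following is the psp.yaml spec, which creates a ClusterRole that allows use of the vmware-system-privileged PodSecurityPolicy and a ClusterRoleBinding that grants it to all service accounts.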

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: psp:privileged
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  verbs: ['use']
  resourceNames:
  - vmware-system-privileged
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: all:psp:privileged
roleRef:
  kind: ClusterRole
  name: psp:privileged
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: Group
  name: system:serviceaccounts
  apiGroup: rbac.authorization.k8s.io

Apply the yaml file as shown below:

% KUBECONFIG=gc.kubeconfig k apply -f psp.yaml
clusterrole.rbac.authorization.k8s.io/psp:privileged created
clusterrolebinding.rbac.authorization.k8s.io/all:psp:privileged created
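To see why this binding unblocks the pods: PodSecurityPolicy admission admits a pod only if the requesting user or the pod's service account is authorized to use a policy. The following is a simplified Python model of that check, not the real Kubernetes authorizer. It only covers the fields that appear in psp.yaml; the function names and the policy name "some-other-psp" are made up for illustration.

```python
# Simplified model of the RBAC check behind PodSecurityPolicy admission.
# Only the fields used in psp.yaml are modelled; this is not the real authorizer.

CLUSTER_ROLE = {
    "name": "psp:privileged",
    "rules": [{
        "apiGroups": ["policy"],
        "resources": ["podsecuritypolicies"],
        "verbs": ["use"],
        "resourceNames": ["vmware-system-privileged"],
    }],
}

BINDING = {
    "roleRef": "psp:privileged",
    "subjects": [{"kind": "Group", "name": "system:serviceaccounts"}],
}


def service_account_groups(namespace):
    """Groups that every service account in the given namespace belongs to."""
    return {
        "system:serviceaccounts",
        f"system:serviceaccounts:{namespace}",
        "system:authenticated",
    }


def can_use_psp(groups, psp_name, role=CLUSTER_ROLE, binding=BINDING):
    """True if one of the caller's groups is bound to a rule allowing 'use' on psp_name."""
    bound = binding["roleRef"] == role["name"] and any(
        s["kind"] == "Group" and s["name"] in groups for s in binding["subjects"]
    )
    if not bound:
        return False
    return any(
        "policy" in rule["apiGroups"]
        and "podsecuritypolicies" in rule["resources"]
        and "use" in rule["verbs"]
        and psp_name in rule["resourceNames"]
        for rule in role["rules"]
    )


sa_groups = service_account_groups("webserver")
print(can_use_psp(sa_groups, "vmware-system-privileged"))                 # True
print(can_use_psp(sa_groups, "some-other-psp"))                           # False (hypothetical policy)
print(can_use_psp({"system:authenticated"}, "vmware-system-privileged"))  # False
```

Because the binding targets the system:serviceaccounts group, it applies to service accounts in every namespace of the TKC. That is convenient for a lab, but broader than you may want elsewhere.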

Within a few minutes the deployment succeeds and two nginx pods are running in the webserver namespace.

% KUBECONFIG=gc.kubeconfig kg deploy -n webserver
NAME       READY   UP-TO-DATE   AVAILABLE   AGE
my-nginx   2/2     2            2           80m

% KUBECONFIG=gc.kubeconfig kg pods -n webserver -o wide
NAME                        READY   STATUS    RESTARTS   AGE   IP                NODE                               NOMINATED NODE   READINESS GATES
my-nginx-74d7c6cb98-lzghr   1/1     Running   0          67m   192.168.213.132   gc-workers-wrknn-f675446b6-rsf6n   <none>           <none>
my-nginx-74d7c6cb98-s59dt   1/1     Running   0          67m   192.168.67.196    gc-workers-wrknn-f675446b6-f6zqs   <none>           <none>

03 Access the application

You can access the application in several ways, depending on the use case.

---Port-forward---

% KUBECONFIG=gc.kubeconfig kubectl port-forward deployment/my-nginx -n webserver 8080:80
Forwarding from 127.0.0.1:8080 -> 80
Forwarding from [::1]:8080 -> 80
Handling connection for 8080

The deployment is port-forwarded now. If you open another terminal and do curl localhost:8080, you can see the nginx webpage.

% curl localhost:8080
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

You can also open http://localhost:8080/ in a web browser and you will get the same nginx webpage. Port-forwarding is fine for local development and testing, but you would not want to rely on it in a production setup. There, you need to create a service that exposes the application.
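If you want to script the curl check instead of reading the HTML by eye, here is a minimal Python sketch using only the standard library. The helper names page_title and check_nginx are my own, and the check simply assumes the default nginx welcome page is being served.

```python
import re
import urllib.request


def page_title(html):
    """Return the text of the first <title> element, or None if there is none."""
    match = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


def check_nginx(url, timeout=5):
    """Fetch url and report whether it serves the default nginx welcome page."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
    return page_title(body) == "Welcome to nginx!"


# Example call (needs the port-forward above to be running):
# check_nginx("http://localhost:8080/")
```

The same function can be pointed at any address the application is reachable on, as long as that address is routable from where the script runs.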

Services

Kubernetes has four service types. The following three are covered here (the fourth, ExternalName, is not used in this post):

NodePort: Similar to port forwarding: a port on every node is forwarded to the target port of the pods where the application is running.
ClusterIP: This is useful if you want to access the application from within the cluster.
LoadBalancer: This is used to provide access to external users. In my case, NSX-T will be providing this access.
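These three types build on each other: by default a NodePort service also gets a cluster IP, and a LoadBalancer service also gets a node port and a cluster IP. The Python sketch below lists the addresses a service answers on for each type. The function access_paths and its parameters are illustrative; the addresses are taken from the command outputs in this post.

```python
def access_paths(svc_type, port, cluster_ip, node_ips=(), node_port=None, external_ip=None):
    """List the addresses a Service answers on, from innermost to outermost."""
    paths = [f"{cluster_ip}:{port}"]  # all three types get a cluster IP
    if svc_type in ("NodePort", "LoadBalancer"):
        # The same node port is opened on every node.
        paths += [f"{ip}:{node_port}" for ip in node_ips]
    if svc_type == "LoadBalancer" and external_ip:
        paths.append(f"{external_ip}:{port}")
    return paths


WORKERS = ["172.29.21.195", "172.29.21.196", "172.29.21.197"]

for args in (
    ("ClusterIP", 80, "10.111.182.155"),
    ("NodePort", 80, "10.111.182.155", WORKERS, 30741),
    ("LoadBalancer", 80, "10.111.182.155", WORKERS, 32398, "10.186.148.170"),
):
    print(args[0], access_paths(*args))
```

Whether each address is actually reachable still depends on where the client sits: cluster IPs only work inside the TKC, and the worker node IPs here are only reachable from the supervisor side.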

---Service NodePort---

Following is the YAML spec file for a service of type NodePort:

% cat nginx-service-np.yaml
apiVersion: v1
kind: Service
metadata:
  name: my-nginx
  labels:
    run: my-nginx
spec:
  type: NodePort
  ports:
  - targetPort: 80
    port: 80
    protocol: TCP
  selector:
    run: my-nginx

Apply the above yaml file.

% KUBECONFIG=gc.kubeconfig k apply -f nginx-service-np.yaml -n webserver
service/my-nginx created

% KUBECONFIG=gc.kubeconfig kg svc -n webserver
NAME       TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
my-nginx   NodePort   10.111.182.155   <none>        80:30741/TCP   4s

% KUBECONFIG=gc.kubeconfig kg ep -n webserver
NAME       ENDPOINTS                              AGE
my-nginx   192.168.213.132:80,192.168.67.196:80   32m

As you can see, a service (my-nginx) of type NodePort is created, and the application should now be accessible on port 30741 of any worker node. To verify this, we first need connectivity to a worker node IP. For that, we need a jumpbox pod deployed in the supervisor namespace; from the jumpbox pod we can reach (and SSH to) the TKC nodes. You can follow my previous post to see how to create a jumpbox pod. Here is the link to VMware documentation for how to SSH to TKC nodes.

% KUBECONFIG=sv.kubeconfig k exec -it jumpbox -- sh
sh-4.4#
sh-4.4# curl 172.29.21.197:30741
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
sh-4.4#

---Service ClusterIP---

A service of type ClusterIP is accessible only within the TKC. So, I need a jumpbox pod or test pod inside the TKC and connect from there. First, let me edit the my-nginx service and change its type from NodePort to ClusterIP.

% KUBECONFIG=gc.kubeconfig k edit svc my-nginx -n webserver
service/my-nginx edited

% KUBECONFIG=gc.kubeconfig kg svc -n webserver
NAME       TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
my-nginx   ClusterIP   10.111.182.155   <none>        80/TCP    39m

I have already deployed a pod inside the TKC. As you can see, dnsutils is the pod that is deployed in the default namespace. We will connect to this pod, and from there we can curl the cluster IP of the my-nginx service.

% KUBECONFIG=gc.kubeconfig kg pods
NAME       READY   STATUS    RESTARTS   AGE
dnsutils   1/1     Running   1          105m

% KUBECONFIG=gc.kubeconfig k exec -it dnsutils -- sh
#
# curl 10.111.182.155:80
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
#

Note: This service of type ClusterIP can be accessed only within the TKC, and not externally!

---Service LoadBalancer---

This is the way to expose your service to external users. In this case, NSX-T provides the external IP, and traffic to it is then forwarded internally to the nginx pods through the my-nginx service.

I have edited the service my-nginx from type ClusterIP to LoadBalancer.

% KUBECONFIG=gc.kubeconfig k edit svc my-nginx -n webserver
service/my-nginx edited

% KUBECONFIG=gc.kubeconfig kg svc -n webserver
NAME       TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
my-nginx   LoadBalancer   10.111.182.155   <pending>     80:32398/TCP   56m

% KUBECONFIG=gc.kubeconfig kg svc -n webserver
NAME       TYPE           CLUSTER-IP       EXTERNAL-IP      PORT(S)        AGE
my-nginx   LoadBalancer   10.111.182.155   10.186.148.170   80:32398/TCP   56m

You can see that the service now has an external IP. The endpoints of the service, shown below, are the nginx pod IPs.

% KUBECONFIG=gc.kubeconfig kg ep -n webserver
NAME       ENDPOINTS                              AGE
my-nginx   192.168.213.132:80,192.168.67.196:80   58m

% curl 10.186.148.170
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

I could also use the external IP 10.186.148.170 in a web browser to access the nginx webpage.

Now let's have a look at what is in the supervisor namespace. This TKC is created under the supervisor namespace "vineetha-test04-deploy".

% kubectl get svc -n vineetha-test04-deploy
NAME                       TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)          AGE
gc-ba320a1e3e04259514411   LoadBalancer   172.28.5.217    10.186.148.170   80:31143/TCP     40h
gc-control-plane-service   LoadBalancer   172.28.9.37     10.186.149.120   6443:31639/TCP   51d

% kubectl get ep -n vineetha-test04-deploy
NAME                       ENDPOINTS                                                     AGE
gc-ba320a1e3e04259514411   172.29.21.195:32398,172.29.21.196:32398,172.29.21.197:32398   40h
gc-control-plane-service   172.29.21.194:6443                                            51d

So what you are seeing is that, for a service of type LoadBalancer created inside the TKC, a service of type LoadBalancer (gc-ba320a1e3e04259514411) is automatically created under the supervisor namespace, and its endpoints are the IP addresses of the TKC worker nodes on the NodePort (32398) of the my-nginx service.
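This relationship can be checked against the outputs above with a few lines of Python. The sketch takes the PORT(S) value of my-nginx and the worker node IPs, derives the endpoints you would expect on the supervisor-side service, and compares them with what kubectl reported. The helper names parse_ports and supervisor_endpoints are my own.

```python
def parse_ports(column):
    """Split a kubectl PORT(S) value like '80:32398/TCP' into (port, node_port, protocol)."""
    ports, protocol = column.split("/")
    port, _, node_port = ports.partition(":")
    return int(port), (int(node_port) if node_port else None), protocol


def supervisor_endpoints(worker_ips, node_port):
    """Expected endpoints of the supervisor-side Service: every worker on the NodePort."""
    return [f"{ip}:{node_port}" for ip in worker_ips]


# Values copied from the command outputs in this post.
worker_ips = ["172.29.21.195", "172.29.21.196", "172.29.21.197"]
port, node_port, protocol = parse_ports("80:32398/TCP")
observed = "172.29.21.195:32398,172.29.21.196:32398,172.29.21.197:32398".split(",")

expected = supervisor_endpoints(worker_ips, node_port)
print(expected == observed)  # True
```

So the path for an external request is: external IP 10.186.148.170:80 on the NSX-T load balancer, then a worker node on port 32398, then one of the nginx pods on port 80.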

On the NSX-T side you can see the LB for my supervisor namespace, virtual servers in it, and server pool members in the virtual server.

I hope it was useful. Cheers!

Sunday, May 30, 2021

vSphere with Tanzu using NSX-T - Part8 - Create namespace and deploy Tanzu Kubernetes Cluster

In the previous posts we discussed the following:

vSphere with Tanzu using NSX-T - Part1 - Prerequisites

vSphere with Tanzu using NSX-T - Part2 - Configure NSX

vSphere with Tanzu using NSX-T - Part3 - Edge Cluster

vSphere with Tanzu using NSX-T - Part4 - Tier-0 Gateway and BGP peering

vSphere with Tanzu using NSX-T - Part5 - Tier-1 Gateway and Segments

vSphere with Tanzu using NSX-T - Part6 - Create tags, storage policy, and content library

vSphere with Tanzu using NSX-T - Part7 - Enable workload management

Now that we have enabled workload management, the next step is to create namespaces on the supervisor cluster and set resource quotas as per requirements. The vSphere administrator can then give developers access to these namespaces, where they can deploy Tanzu Kubernetes clusters, VMs, or vSphere pods.

Create namespace.

Select the cluster and provide a name for the namespace.

Now the namespace is created successfully. Before handing over this namespace to the developer, you can set permissions, assign storage policies, and set resource limits.

Let's have a look at the NSX-T components that are instantiated when we created a new namespace.

A new segment is now created for the newly created namespace. This segment is connected to the T1 Gateway of the supervisor cluster.

A SNAT rule is also now in place on the supervisor cluster T1 Gateway. This helps the Kubernetes objects residing in the namespace to reach the external network/ internet. It uses the egress range 192.168.72.0/24 that we provided during the workload management configuration for address translation.

We can now assign a storage policy to this newly created namespace.

Click on Add Storage and select the storage policy. In my case, I am using Tanzu Storage Policy which uses a vsanDatastore.

Let's apply some capacity and usage limits for this namespace. Click edit limits and provide the values.

Let's set user permissions to this newly created namespace. Click add permissions.

Now we are ready to hand over this new namespace to the dev user (John).

Under the first tile, you can see Copy link. You can provide this link to the dev user, who can open it in a web browser to access the CLI tools needed to connect to the newly created namespace.

Download and install the CLI tools. In my case, CLI tools are installed on a CentOS 7.x VM. You can also see the user John has connected to the newly created namespace using the CLI.

The user can now verify the resource limits of the namespace using kubectl.

You can see the following limits:

cpu-limit: 21.818
memory-limit: 131072Mi
storage: 500Gi

Storage is limited to 500 GB and memory to 128 GB (131072 Mi), which is straightforward. We (vSphere admin) had set the CPU limit to 48 GHz, and what you see here is that the cpu-limit of this namespace is 21.818 CPU cores. To give some background on this calculation: each ESXi host that I am using for this study has 20 physical cores and a total CPU capacity of 44 GHz, and I have 4 such ESXi hosts in the cluster. The computing power of one physical core is therefore (44 / 20) = 2.2 GHz. So, to limit the CPU to 48 GHz, the number of CPU cores is limited to (48 / 2.2) = 21.818.
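The same arithmetic as a short Python sketch, in case you want to plug in your own host specifications. The variable names are mine; the numbers are the ones from this environment.

```python
# Convert a namespace CPU limit in GHz into the core count reported by kubectl.
host_capacity_ghz = 44.0   # total CPU capacity of one ESXi host
physical_cores = 20        # physical cores per ESXi host
cpu_limit_ghz = 48.0       # limit set by the vSphere admin on the namespace

ghz_per_core = host_capacity_ghz / physical_cores
cpu_limit_cores = cpu_limit_ghz / ghz_per_core

print(round(ghz_per_core, 3))      # 2.2
print(round(cpu_limit_cores, 3))   # 21.818

# The memory limit is reported in Mi: 131072 Mi / 1024 = 128 Gi.
print(131072 / 1024)               # 128.0
```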

Apply the following cluster definition yaml file to create a Tanzu Kubernetes cluster under the ns-01-dev-john namespace.

apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TanzuKubernetesCluster
metadata:
  namespace: ns-01-dev-john
spec:
  topology:
    controlPlane:
      storageClass: tanzu-storage-policy
    workers:
      storageClass: tanzu-storage-policy
  distribution:
    version: v1.18.15
  settings:
    network:
      services:
        cidrBlocks: ["198.32.1.0/12"]
      pods:
        cidrBlocks: ["192.1.1.0/16"]
      cni:
    storage:
      defaultClass: tanzu-storage-policy
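A note on the cidrBlocks values: both are written with host bits set, so it can help to see which networks they actually describe. The Python sketch below uses the standard ipaddress module to normalize them and to confirm that they overlap neither each other nor the egress range from the workload management configuration. This is a sanity check I am adding for illustration, not something the cluster definition performs.

```python
import ipaddress

# strict=False accepts addresses with host bits set and masks them off.
services = ipaddress.ip_network("198.32.1.0/12", strict=False)
pods = ipaddress.ip_network("192.1.1.0/16", strict=False)
egress = ipaddress.ip_network("192.168.72.0/24")

print(services)                   # 198.32.0.0/12
print(pods)                       # 192.1.0.0/16
print(services.overlaps(pods))    # False
print(services.overlaps(egress))  # False
print(pods.overlaps(egress))      # False
```

With the default strict=True, ip_network("198.32.1.0/12") raises a ValueError because of the host bits, which is a quick way to spot such values.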

You can see the corresponding VMs in the vCenter UI.

Now, let's have a look at the NSX-T side.

A Tier-1 Gateway is now available with a segment linked to it.

You can see a server load balancer with one virtual server that provides access to KubeAPI (6443) of the Tanzu Kubernetes cluster that we just deployed.

You can also find a SNAT rule. This helps the Tanzu Kubernetes cluster objects to reach the external network/ internet. It uses the egress range 192.168.72.0/24 that we provided during the workload management configuration for address translation.

Note: This architecture is explained on the basis of vSphere 7 U1, and there are changes in newer versions. With vSphere 7 U1c the architecture changed from a per-TKG cluster Tier-1 Gateway model to a per-Supervisor namespace Tier-1 Gateway model. For more details, feel free to refer to the blog series published by Harikrishnan T @hari5611.

In the next part we will discuss monitoring aspects of vSphere with Tanzu environment and Tanzu Kubernetes clusters. I hope this was useful. Cheers!

Saturday, April 24, 2021

vSphere with Tanzu using NSX-T Blog Series

Part1 - Prerequisites
Part2 - Configure NSX
Part3 - Edge Cluster
Part4 - Tier-0 Gateway and BGP peering
Part5 - Tier-1 Gateway and Segments
Part6 - Create tags, storage policy, and content library
Part7 - Enable workload management
Part8 - Create namespace and deploy Tanzu Kubernetes Cluster
Part9 - Monitoring
Part10 - Upgrade Tanzu Kubernetes Cluster
Part11 - Troubleshooting Tanzu Kubernetes Cluster
Part12 - Deploy application on TKC and access it
Part13 - Export WCP admin kubeconfig
Part14 - Testing TKC storage using kubestr
Part15 - Working with etcd on TKC with one control plane
Part16 - Troubleshooting content library related issues
Part17 - Troubleshooting TKC stuck at updating phase
Part18 - Troubleshooting vSphere pods with ProviderFailed status
Part19 - Troubleshooting TKC stuck at creating phase
Part20 - Safely deleting NotReady nodes from a TKC
Part21 - Pointers while upgrading the stack
Part22 - Working with NGINX Ingress Controller
Part23 - Supervisor cluster certificates expiry
Part24 - Kubernetes component certs in TKC
Part25 - Spherelet
Part26 - Jumpbox kubectl plugin to SSH to TKC node
Part27 - nullfinalizer kubectl plugin
Part28 - Create a custom VM Class
Part29 - Logging using Loki stack
Part30 - Troubleshooting inaccessible TKC with server pool members missing in the LB VS
Part31 - Troubleshooting inaccessible TKC with expired control plane certs
Part32 - Troubleshooting BGP related issues
Part33 - Troubleshooting intermittent connection timeouts to apiserver and workloads
Part34 - CPU and Memory utilization of a supervisor cluster
Part35 - Monitoring supervisor cluster health with Python and vCenter APIs

Sunday, April 18, 2021

vSphere with Tanzu using NSX-T - Part7 - Enable workload management

In the previous posts we discussed the following:

Part1: Prerequisites

Part2: Configure NSX-T

Part3: Edge Cluster

Part4: Tier-0 Gateway and BGP peering

Part5: Tier-1 Gateway and Segments

Part6: Create tags, storage policy, and content library

We are all set to configure and enable workload management. Before stepping into the configurations I just want to give an overall picture of vSphere with Tanzu architecture and different components.

Once you enable workload management, the vSphere cluster will transform to a supervisor cluster. The supervisor cluster consists of 3 supervisor control plane VMs, and the ESXi hosts that act as worker nodes too. Now you can run traditional VMs, and containers side by side. You can run the containers as native vSphere pods directly running on the ESXi hosts, or you can deploy Tanzu Kubernetes clusters in VM form factor on the vSphere namespace and then run container workload on them.

Following are the steps to enable workload management:

Log in to vCenter - Menu - Workload Management.
Click Get started.
Select NSX-T and click next.

Select the cluster.

Select a size and click next.

Select the storage policy and click next.

Provide management network details and click next.

Provide workload network details and click next.

Add the content library and click next.

Click finish.

This process takes a while to configure and bring up the supervisor cluster. In my case, it took around 30 minutes to complete.
You can see the progress in the vCenter UI.

You can now see the supervisor control plane VMs are deployed.

Workload management is now enabled and the vSphere cluster is transformed to a supervisor cluster. Let's have a look at the objects that are automatically created in NSX-T.

You can see a T1 Gateway is now provisioned.

Multiple segments are now created corresponding to each namespace inside the supervisor control plane.

Multiple SNAT rules are also now in place for the newly created T1 Gateway, which helps the control plane Kubernetes objects residing in their corresponding namespaces to reach the external network/ internet. It uses the egress range 192.168.72.0/24 that we provided during the workload management configuration for address translation.

You can also see two load balancers attached to the T1 Gateway:

Distributed Load balancer: All services of type ClusterIP are implemented as distributed load balancer virtual servers. This is for east-west traffic.
Server load balancer: All services of type LoadBalancer are implemented as server load balancer L4 virtual servers, and all ingress is implemented as L7 virtual servers.

Under the server load balancer, you can see two virtual servers. One for the KubeAPI (6443) and the other for downloading the CLI tools (443) to access the cluster.

Note that this newly created T1 Gateway (domain-c8:6ea515f0-39da-431b-93bf-0d6a5e4a0f77) is connected to the T0 Gateway for external connectivity through BGP.

The next step is to create namespaces, and you can then create Tanzu Kubernetes clusters on it. Usually, the vSphere administrator will create namespaces for developers and provide the access so that they can either deploy TKG clusters, vSphere pods, or VMs on the respective namespace. We will cover all these in the next part.

Hope it was useful. Cheers!
