
Sunday, January 30, 2022

vSphere with Tanzu using NSX-T - Part14 - Testing TKC storage using kubestr

In the previous posts of this series, we discussed the setup and configuration of vSphere with Tanzu using NSX-T. This article is about using kubestr to test the storage options of a Tanzu Kubernetes Cluster (TKC). Following are the steps to install kubestr on macOS:

  • wget https://github.com/kastenhq/kubestr/releases/download/v0.4.31/kubestr_0.4.31_MacOS_amd64.tar.gz
  • tar -xvf kubestr_0.4.31_MacOS_amd64.tar.gz 
  • chmod +x kubestr
  • mv kubestr /usr/local/bin

 

Now, let's run kubestr help.

% kubestr help
kubestr is a tool that will scan your k8s cluster
        and validate that the storage systems in place as well as run
        performance tests.

Usage:
  kubestr [flags]
  kubestr [command]

Available Commands:
  browse      Browse the contents of a CSI PVC via file browser
  csicheck    Runs the CSI snapshot restore check
  fio         Runs an fio test
  help        Help about any command

Flags:
  -h, --help             help for kubestr
  -e, --outfile string   The file where test results will be written
  -o, --output string    Options(json)

Use "kubestr [command] --help" for more information about a command.

 

I am going to use the following TKC for testing.

% KUBECONFIG=gc.kubeconfig kubectl get nodes                                            
NAME                               STATUS   ROLES                  AGE    VERSION
gc-control-plane-pwngg             Ready    control-plane,master   103d   v1.20.9+vmware.1
gc-workers-wrknn-f675446b6-cz766   Ready    <none>                 103d   v1.20.9+vmware.1
gc-workers-wrknn-f675446b6-f6zqs   Ready    <none>                 103d   v1.20.9+vmware.1
gc-workers-wrknn-f675446b6-rsf6n   Ready    <none>                 103d   v1.20.9+vmware.1

 

Let's run kubestr against the cluster now.

% KUBECONFIG=gc.kubeconfig kubestr                                      

**************************************
  _  ___   _ ___ ___ ___ _____ ___
  | |/ / | | | _ ) __/ __|_   _| _ \
  | ' <| |_| | _ \ _|\__ \ | | |   /
  |_|\_\\___/|___/___|___/ |_| |_|_\

Explore your Kubernetes storage options
**************************************
Kubernetes Version Check:
  Valid kubernetes version (v1.20.9+vmware.1)  -  OK

RBAC Check:
  Kubernetes RBAC is enabled  -  OK

Aggregated Layer Check:
  The Kubernetes Aggregated Layer is enabled  -  OK

W0130 14:17:16.937556   87541 warnings.go:70] storage.k8s.io/v1beta1 CSIDriver is deprecated in v1.19+, unavailable in v1.22+; use storage.k8s.io/v1 CSIDriver
Available Storage Provisioners:

  csi.vsphere.xxxx.com:
    Can't find the CSI snapshot group api version.
    This is a CSI driver!
    (The following info may not be up to date. Please check with the provider for more information.)
    Provider:            vSphere
    Website:             https://github.com/kubernetes-sigs/vsphere-csi-driver
    Description:         A Container Storage Interface (CSI) Driver for VMware vSphere
    Additional Features: Raw Block, Expansion (Block Volume), Topology Aware (Block Volume)

    Storage Classes:
      * sc2-01-vc16c01-wcp-mgmt

    To perform a FIO test, run-
      ./kubestr fio -s <storage class>

 

 

You can run storage performance tests using kubestr; it uses FIO to generate the IOs. For example, this is how you can run a basic storage test against a given storage class:

% KUBECONFIG=gc.kubeconfig kubestr fio -s sc2-01-vc16c01-wcp-mgmt -z 10G
PVC created kubestr-fio-pvc-zvdhr
Pod created kubestr-fio-pod-kdbs5
Running FIO test (default-fio) on StorageClass (sc2-01-vc16c01-wcp-mgmt) with a PVC of Size (10G)
Elapsed time- 29.290421119s
FIO test results:
 
FIO version - fio-3.20
Global options - ioengine=libaio verify=0 direct=1 gtod_reduce=1

JobName: read_iops
  blocksize=4K filesize=2G iodepth=64 rw=randread
read:
  IOPS=3987.150391 BW(KiB/s)=15965
  iops: min=3680 max=4274 avg=3992.034424
  bw(KiB/s): min=14720 max=17096 avg=15968.827148

JobName: write_iops
  blocksize=4K filesize=2G iodepth=64 rw=randwrite
write:
  IOPS=3562.628906 BW(KiB/s)=14267
  iops: min=3237 max=3750 avg=3565.896484
  bw(KiB/s): min=12950 max=15000 avg=14264.862305

JobName: read_bw
  blocksize=128K filesize=2G iodepth=64 rw=randread
read:
  IOPS=2988.549316 BW(KiB/s)=383071
  iops: min=2756 max=3252 avg=2992.344727
  bw(KiB/s): min=352830 max=416256 avg=383056.187500

JobName: write_bw
  blocksize=128k filesize=2G iodepth=64 rw=randwrite
write:
  IOPS=2754.796143 BW(KiB/s)=353151
  iops: min=2480 max=2992 avg=2759.586182
  bw(KiB/s): min=317440 max=382976 avg=353242.781250

Disk stats (read/write):
  sdd: ios=117160/105647 merge=0/1210 ticks=2100090/2039676 in_queue=4139076, util=99.608589%
  -  OK

As you can see, a 10G PVC and an FIO pod are created, and these are used for the FIO test. Once the test is complete, the PVC and the FIO pod are deleted automatically.
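kubestr can also run a custom FIO job file instead of the default profile. The exact flag name can be verified with kubestr fio --help; the sketch below assumes it is -f, and the job parameters are only illustrative.

# Write a custom FIO job file (8k blocks, 70/30 random read/write - illustrative values)
cat > randrw-8k.fio <<'EOF'
[global]
ioengine=libaio
direct=1
bs=8k
iodepth=16
size=2G
[randrw-8k]
rw=randrw
rwmixread=70
EOF

# Run the custom job against the same storage class (assumes -f is the custom job file flag)
KUBECONFIG=gc.kubeconfig kubestr fio -s sc2-01-vc16c01-wcp-mgmt -z 10G -f randrw-8k.fio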

I hope it was useful. Cheers!


Saturday, November 28, 2020

Storage performance benchmarking of Tanzu Kubernetes Clusters

Benchmarking of IT infrastructure is standard practice and is usually done before putting it into a production environment. It gives you baseline values for different performance aspects of the system or solution under test. These benchmarking principles are applicable to Kubernetes clusters too, but the test cases and evaluation criteria may vary slightly compared to benchmarking a traditional IT infrastructure.

Following are some of the test considerations:

  • Performance of PVCs.
    • Time to provision PVCs.
    • Read/ Write IOPS and Latency of PVCs.
  • Pod startup latency.
  • The time consumed to complete the deployment of different K8s objects.
    • Statefulset
    • Deployment etc.
  • Performance behavior of sample application workloads.
  • Network performance and connectivity between different K8s nodes.

In this article, I will explain a quick and easy way to benchmark the storage system used by the Kubernetes cluster to provision PVCs for application workloads. I am using FIO to generate storage IOs. You can use the following YAML file to deploy FIO pods as a statefulset. Note that here I am using a PowerFlex vVol datastore as Cloud Native Storage (CNS) for the Tanzu K8s clusters, and hence the storage class "powerflex-storage-policy". This may differ in your case, and you might need to modify it to match the storage class available in your setup.
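The actual manifest is applied from a gist later in this post; the following is only a minimal sketch of what such a statefulset can look like. The container image placeholder, FIO file size, PVC size, and runtime here are illustrative assumptions, while the replica count, block size, read/write mix, job count, iodepth, and storage class come from the description above.

cat <<'EOF' | kubectl apply -n <namespace name> -f -
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: fiopod-statefulset-multipod
spec:
  serviceName: fiopod
  replicas: 15
  selector:
    matchLabels:
      app: fiopod
  template:
    metadata:
      labels:
        app: fiopod
    spec:
      containers:
      - name: fio
        image: <fio image>   # any image that ships the fio binary
        command: ["fio"]
        args: ["--name=randrw-8k", "--rw=randrw", "--rwmixread=70", "--bs=8k",
               "--numjobs=2", "--iodepth=16", "--ioengine=libaio", "--direct=1",
               "--size=2G", "--time_based", "--runtime=3600",
               "--filename=/data/fiofile"]
        volumeMounts:
        - name: fio-data
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: fio-data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: powerflex-storage-policy
      resources:
        requests:
          storage: 10Gi
EOF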


This YAML file will deploy a statefulset with 15 FIO pods (as per the number of replicas mentioned) and will start the storage IO stress test (8k block size, 70% random reads, 30% random writes, 2 jobs, 16 iodepth) on the attached PVC as soon as each pod starts. A total of 15 PVCs will be created in this case, and one PVC will be attached to each FIO pod.

Note: If you get the error "forbidden: unable to validate against any pod security policy" after applying the above statefulset, the pods will not get created. You will need to first create and apply a Pod Security Policy (PSP) to the Tanzu Kubernetes Cluster.
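One commonly used (and very permissive) way to get past this on a Tanzu Kubernetes Cluster is to bind the built-in privileged PSP to all authenticated users. Treat the example below as a quick option for lab testing rather than a security recommendation, and adjust it to your own requirements:

kubectl create clusterrolebinding default-tkg-admin-privileged-binding --clusterrole=psp:vmware-system-privileged --group=system:authenticated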


Following is an overview of my vSphere with Tanzu setup:

Tanzu K8s control plane nodes/ master VMs: 3
Tanzu K8s worker nodes/ VMs: 15


Contexts, Tanzu K8s cluster nodes, and storage class.
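If you want to list the same information in your environment, commands along these lines will do it (context and namespace names will obviously differ):

kubectl config get-contexts
kubectl get nodes -o wide
kubectl get storageclass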


Create a statefulset using the above YAML file.
kubectl apply -f https://gist.githubusercontent.com/vineethac/7c9f6ce2b72868b8832a4404b79ebba2/raw/980f9d6c24c10b1b7b39b20d80c15a9f2ee6c4f1/fio_ss.yaml -n <namespace name>
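You can watch the FIO pods and their PVCs come up with:

kubectl get pods -n <namespace name> -w
kubectl get pvc -n <namespace name>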


You can see that it took roughly 6 minutes to deploy the 15 FIO pods and their corresponding PVCs. The time may vary depending on whether the FIO image is already available locally on the nodes, the resources available on the nodes, etc.


As each pod is created, FIO automatically starts the IO stress test, and IOs are read from and written to the attached PVC. As I mentioned earlier, I am using the storage class "powerflex-storage-policy", which is associated with a vVol datastore backed by a PowerFlex storage pool. In this case, all the PVCs are created on that PowerFlex vVol datastore.


You can also see multiple volumes in the PowerFlex UI; all the volume names starting with "vasa" are externally managed by the PowerFlex VASA provider. The performance of each volume can also be monitored using the PowerFlex UI.


If you would like to see historical performance data, you can use vROps. Dell EMC has recently released a vROps management pack for PowerFlex systems. It is a monitoring and alerting solution that provides extensive visibility into the PowerFlex infrastructure. For monitoring K8s clusters and resources, you can use the vROps management pack for container monitoring.


Note: When the duration mentioned in the FIO test is over, the pods will restart and the IO stress will start again. To modify the FIO parameters, you can use kubectl edit statefulset fiopod-statefulset-multipod -n <namespace name>, change the required parameters, and save; the new changes will be applied automatically. Once you are done with the testing, you can delete the statefulset and the corresponding PVCs using kubectl delete. This method is useful when you want to test something quickly or if you have only a few test profiles. If you have many test profiles with varying block sizes, iodepth, etc., you will need to build a small script to automate the process.
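For example:

# Edit the FIO parameters in place; the pods will restart with the new spec
kubectl edit statefulset fiopod-statefulset-multipod -n <namespace name>

# Clean up once testing is done (deleting the PVCs removes the test data)
kubectl delete statefulset fiopod-statefulset-multipod -n <namespace name>
kubectl delete pvc --all -n <namespace name>   # assumes the namespace is dedicated to this test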

Hope it was useful. Cheers!




Thursday, July 9, 2020

Tanzu Kubernetes Grid (TKG) on vSphere 6.7 U3 - Part3

In this blog, I will explain how to deploy an FIO application pod with persistent storage on your Tanzu Kubernetes workload cluster.

Step 1: Deploy a K8s workload cluster

tkg create cluster <cluster name> --plan=dev


Now the workload K8s cluster is deployed with a master node, a load balancer (LB), and a worker node.
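To start using the new cluster with kubectl, you can fetch its kubeconfig and switch to its context. A quick sketch follows; the cluster name is a placeholder, and the exact command and context name format may vary with the TKG CLI version:

tkg get credentials <cluster name>
kubectl config use-context <cluster name>-admin@<cluster name>
kubectl get nodes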


Saturday, April 18, 2020

vSAN performance benchmarking

In this article, I will briefly explain performance benchmarking considerations, factors affecting performance, and some best practices. We do performance benchmarking to understand the capabilities and bottlenecks of a system. When I say system, it could be a storage system, CPU, GPU, network switch, etc. Now let's consider a VMware vSAN cluster infrastructure. It includes multiple components, and each of these contributes to the overall performance. In this case, the vSAN cluster is the solution under test, and we have to conduct performance benchmarking to understand its storage performance behavior. When I say storage behavior, I mean the IOPS, latency, and throughput that the cluster can produce under varying loads.

The goal of benchmarking
  • Identify bottlenecks
    • Hardware bottleneck
    • Software bottleneck
    • Application bottleneck
  • Compare tradeoffs
  • Manage expectations
  • Make decisions

Usually, in a real-world scenario, benchmarking is done once the cluster is deployed and ready, and before it starts hosting production workloads. As these benchmark values define the performance maximums, they help you decide when to scale or upgrade the cluster before it hits a bottleneck.

Fundamental factors of vSAN performance

Server hardware
  • Compatibility as per vSAN HCL
Host
  • Number of hosts in the cluster
  • Power settings
  • CPU - number of cores and frequency 
Storage
  • Hybrid or All-flash
  • NVMe, SAS, or SATA
  • Number of disk groups per host
  • Storage controller configuration
  • Compatibility of hardware devices as per vSAN HCL
Network
  • 10/ 25/ 40 GbE
  • MTU 
  • LAG
SPBM policy
  • FTT (Failures To Tolerate)
  • FTM (Mirroring/ Erasure coding)
  • Thin or Thick provision
Security
  • Encryption
  • Checksum
Other
  • Stripe width
  • Flash read cache reservation
  • IOPS limit for object
All of the above factors will affect performance, so you should know the benefits and tradeoffs of each.

Benchmarking methodology

Image credit: VMware

Storage benchmarking tools

IO load generation tools
Application-specific tools
  • HammerDB (MSSQL, Oracle)
  • Jetstress (MS Exchange)
  • SLOB (Oracle)
  • DBGen (MSSQL, Oracle)

Best practices

  • Understand the production performance metrics.
  • Test what you plan to deploy.
  • Workload modeling.
  • Plan for use case testing.
  • Choose an appropriate size for benchmarking.
  • Choose the right tool.
  • Pre-allocate blocks while testing.
  • Test for a longer time duration (see the sample FIO command after this list).
  • Deploy multiple VMs with multiple VMDKs.
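To make a couple of these points concrete, here is a sample FIO command of the kind you could run inside a test VM. The file path, pre-allocated file size, runtime, and other values are only illustrative; adjust them to your workload model:

# fio lays out (pre-allocates) the 50G test file before the timed run starts
fio --name=randrw-8k --filename=/mnt/testdisk/fiofile --size=50G \
    --rw=randrw --rwmixread=70 --bs=8k --numjobs=2 --iodepth=16 \
    --ioengine=libaio --direct=1 --time_based --runtime=1800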
