Benchmarking IT infrastructure is standard practice, usually done before putting a system into a production environment. It gives you baseline values for different performance aspects of the system or solution under test. These benchmarking principles apply to Kubernetes clusters too, but the test cases and evaluation criteria may vary slightly compared to benchmarking traditional IT infrastructure.
Following are some of the test considerations:
- Performance of PVCs:
  - Time to provision PVCs.
  - Read/write IOPS and latency of PVCs.
- Pod startup latency.
- Time taken to deploy different K8s objects (StatefulSet, Deployment, etc.).
- Performance behavior of sample application workloads.
- Network performance and connectivity between K8s nodes.
In this article, I will explain a quick and easy way to benchmark the storage system used by a Kubernetes cluster to provision PVCs for application workloads. I am using FIO to generate storage IO. You can use the following YAML file to deploy FIO pods as a StatefulSet. Note that I am using a PowerFlex vVol datastore as Cloud Native Storage (CNS) for Tanzu Kubernetes clusters, hence the storage class "powerflex-storage-policy". This may differ in your case, and you might need to modify it to match the storage class available in your setup.
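The manifest embedded in the original post is not reproduced here, but a minimal sketch of such a StatefulSet looks like the following. The StatefulSet name and storage class come from this article; the namespace, image, volume size, and runtime are assumptions, so adjust them to your environment.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: fiopod-statefulset-multipod
  namespace: fio                       # assumed namespace; change as needed
spec:
  serviceName: fio
  replicas: 15                         # one FIO pod (and one PVC) per replica
  selector:
    matchLabels:
      app: fio
  template:
    metadata:
      labels:
        app: fio
    spec:
      containers:
      - name: fio
        image: my-registry/fio:latest  # assumed; use any image with fio installed
        command: ["fio"]
        args:                          # 8k blocks, 70/30 random read/write, 2 jobs, iodepth 16
        - --name=benchtest
        - --filename=/data/fiotest
        - --size=10g
        - --bs=8k
        - --rw=randrw
        - --rwmixread=70
        - --numjobs=2
        - --iodepth=16
        - --ioengine=libaio
        - --direct=1
        - --runtime=3600
        - --time_based
        - --group_reporting
        volumeMounts:
        - name: fio-data
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: fio-data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: powerflex-storage-policy
      resources:
        requests:
          storage: 20Gi
```

Apply it with kubectl apply -f and watch the pods and PVCs come up with kubectl get pods,pvc -n fio -w.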
This YAML file deploys a StatefulSet with 15 FIO pods (as per the number of replicas mentioned) and starts the storage IO stress test (8k block size, 70% random reads, 30% random writes, 2 jobs, iodepth of 16) on the attached PVC as soon as each pod starts. A total of 15 PVCs will be created in this case, with one PVC attached to each FIO pod.
Note: If you get the error "forbidden: unable to validate against any pod security policy" after applying the above StatefulSet, the pods will not get created. You will need to first create and apply a Pod Security Policy (PSP) to the Tanzu Kubernetes cluster.
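The PSP-related YAML was embedded as a gist in the original post. As a sketch: on Tanzu Kubernetes clusters, one common approach is to let authenticated users use the built-in privileged PSP via a ClusterRoleBinding like the one below. The binding name is arbitrary; psp:vmware-system-privileged refers to the ClusterRole for the default privileged PSP shipped with Tanzu Kubernetes clusters.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: psp-authenticated              # arbitrary binding name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: psp:vmware-system-privileged   # ClusterRole for the default privileged PSP
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:authenticated           # grants PSP use to all authenticated users; scope tighter in production
```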
In my environment, it took roughly 6 minutes to deploy 15 FIO pods and the corresponding PVCs. The time may vary depending on whether the FIO image is locally available on the nodes, the available resources on the nodes, etc.
As each pod is created, FIO automatically starts the IO stress on it, reading from and writing to the attached PVC. As mentioned earlier, I am using the storage class "powerflex-storage-policy", which is associated with a vVol datastore backed by a PowerFlex storage pool. In this case, all the PVCs are created on the PowerFlex vVol datastore.
You can also see multiple volumes in the PowerFlex UI; all the volume names starting with "vasa" are externally managed by the PowerFlex VASA provider. The performance of each volume can also be monitored using the PowerFlex UI.
If you would like to see the historical performance data, you can use vROps. Dell EMC has recently released a vROps management pack for PowerFlex systems. It is a monitoring and alerting solution that provides extensive visibility into the PowerFlex infrastructure. For monitoring K8s clusters and resources, you can use the vROps management pack for container monitoring.
Note: When the duration specified in the FIO test is over, the pods will restart and the IO stress will start again. To modify the FIO parameters, run kubectl edit statefulset fiopod-statefulset-multipod -n fio, change the required parameters, and save; the new changes will get applied automatically. Once you are done with the testing, you can delete the StatefulSet and the corresponding PVCs using kubectl delete. This method is useful when you want to test something quickly or have only a few test profiles. If you have many test profiles with varying block sizes, iodepth, etc., you will need to build a small script to automate the process.
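For example, a small script along the following lines could cycle through several profiles by patching the StatefulSet's container arguments and waiting for each run to finish. The helper function, profile list, namespace, and container layout are all assumptions here; adjust them to match your manifest.

```shell
#!/bin/sh
# Hypothetical helper: builds an fio argument string for a given block size
# and iodepth, keeping the 70/30 random read/write profile fixed.
build_fio_args() {
  bs="$1"; iodepth="$2"
  echo "--name=benchtest --filename=/data/fiotest --bs=${bs} --rw=randrw --rwmixread=70 --numjobs=2 --iodepth=${iodepth} --ioengine=libaio --direct=1 --runtime=600 --time_based"
}

# Profiles to run, as "<block size> <iodepth>" pairs.
for profile in "4k 16" "8k 16" "64k 32"; do
  set -- $profile
  args="$(build_fio_args "$1" "$2")"
  echo "Next profile: bs=$1 iodepth=$2"
  # Uncomment against a real cluster (assumes namespace "fio", single container):
  # kubectl patch statefulset fiopod-statefulset-multipod -n fio --type=json \
  #   -p "[{\"op\":\"replace\",\"path\":\"/spec/template/spec/containers/0/args\",\"value\":[\"$args\"]}]"
  # sleep 660   # fio runtime plus some headroom before switching profiles
done
```

Patching the pod template makes the StatefulSet roll its pods, so each new profile starts on a fresh pod restart, matching the behavior described above.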
In this post, we will take a look at the different resource kinds that are part of the Dell EMC PowerFlex Management Pack. Following is a very high-level logical representation of the PowerFlex Adapter resource kinds and their relationships:
Go to Environment > All Objects > PowerFlex Adapter.
You can also get a PowerFlex system-level view in vROps using the PowerFlex rack/appliance system resource kind. This system view makes use of the system name field that we provided while configuring each PowerFlex Adapter instance type; the system name is used to group all the logical components of one PowerFlex system.
This view provides end-to-end visibility of the PowerFlex infrastructure components, which is useful for understanding the relationship between different layers of the stack and for identifying and troubleshooting issues.
We have covered the installation and configuration of the PowerFlex Management Pack in the previous posts. In this post, we will have a look at the different dashboards that are part of the MP. Following are the 13 dashboards you will get after installing the MP:
Overview
- PowerFlex System Overview

PowerFlex Manager
- PowerFlex Manager Details

Management Controller
- PowerFlex Management Controller

Compute
- PowerFlex ESXi Cluster Usage
- PowerFlex ESXi Host Usage
- PowerFlex SVM Utilization

Networking
- PowerFlex Networking Environment
- PowerFlex Networking Performance

Storage
- PowerFlex Summary
- PowerFlex Details
- PowerFlex Replication Details

Server Hardware
- PowerFlex Node Summary
- PowerFlex Node Details
Now, let's have a quick look at some of these dashboards and their functionality.
PowerFlex Node Summary
This dashboard shows the health of all PowerFlex nodes monitored by the MP. You can see the classification of nodes as Compute Only, Storage Only, Hyperconverged, and Management Controller, along with the relationship between a node and its corresponding hardware components.
PowerFlex Summary
This dashboard shows the health status of all the logical components of the PowerFlex storage system. It also has a parent-child relationship between different objects of the storage system. You can also see widgets for capacity usage trend forecasting, alerts, top storage pools by capacity usage, top volumes by size, etc.
PowerFlex Details
This dashboard shows all PowerFlex storage performance KPIs like IOPS, Bandwidth, Latency, etc.
PowerFlex Networking Environment
You can see the health status of Cisco networking components and the relationship between network interfaces, nodes, switch ports, VLANs, port-channels, etc.
PowerFlex Networking Performance
This dashboard shows switch and switch port KPIs like Throughput, Errors, Packet discards, etc.
PowerFlex Manager
You can see the service deployment details like service health, RCM compliance status, deployment status, etc. in this dashboard.
Before getting into the configuration, I would like to provide a high-level view of my lab setup. I have two separate PowerFlex rack systems that I will be monitoring using the management pack. The two systems are named RAMS and VIKINGS and have the following components.
The PowerFlex Management Pack supports the following four instance types:
PowerFlex Networking - queries and collects networking details from Cisco switches
PowerFlex Gateway - queries and collects storage details from PowerFlex Gateway
PowerFlex Nodes - queries and collects server hardware health details from iDRACs
PowerFlex Manager - queries and collects service deployment details from PowerFlex Manager
Note: The default collection interval for all PowerFlex Adapter instance types is set to 5 minutes.
I have already configured the controller VCSA and customer VCSA of both (RAMS and VIKINGS) clusters as shown below. This makes use of the native vSphere Adapter and vSAN Adapter present in vROps.
Now we can start adding required accounts for the PowerFlex Adapter to connect to the different REST endpoints.
PowerFlex Networking
Click add account.
Select the PowerFlex Adapter.
Let's configure the account for monitoring Cisco TOR switches of the RAMS cluster.
Provide the following details:
Name
Management IP address of Cisco TOR switches
Select the instance type as "PowerFlex Networking" and provide a system name. In this case, these TOR switches are part of RAMS. So I have given the system name as RAMS.
Add a new credential. Select the credential kind as "PowerFlex Networking Adapter Credentials".
Provide a credential name, username and password. Click OK.
Click VALIDATE CONNECTION.
If everything is fine, you will get a test connection successful message. Click OK.
Click ADD to save the account. You will see the account we just created on the Other Accounts page. Initially, the status will be Warning, but it will turn to OK in a few seconds.
Note: The product guide recommends configuring no more than 40 Cisco switches in one PowerFlex Networking instance. So, if you have 80 switches in your PowerFlex system, you will need to configure two PowerFlex Networking instances, each connecting to, querying, and collecting details from 40 switches.
PowerFlex Gateway
PowerFlex Nodes
Make sure to provide the PowerFlex Management Controller vCenter details in the advanced settings. If you configured the native adapter with the vCenter IP address, then provide the same IP address in the advanced settings. In my case, I configured the native adapter with the vCenter hostname/FQDN, so in the advanced settings I provided the same FQDN. This field is used to identify and classify the PowerFlex Management Controller nodes.
Note: The product guide recommends configuring 30 or fewer iDRACs in one PowerFlex Node instance. So, if you have 120 nodes in your PowerFlex system, you will need to configure four PowerFlex Node instances, each connecting to, querying, and collecting details from 30 iDRACs.
PowerFlex Manager
Note: While adding the credentials for the PowerFlex Manager, it is mandatory to provide the PowerFlex Manager Domain Name. VXFMLOCAL is the domain name for the default admin user.
Verify the status of all accounts.
Now we have finished creating all the required accounts to monitor the RAMS system. Similarly, you can add multiple PowerFlex systems and monitor them using the management pack. In my case, I have one more PowerFlex system named VIKINGS, and I have added all the required accounts as shown in the following screenshot. As you can see below, for the VIKINGS system I have configured separate instances for CO, SO, and Controller nodes, because the iDRAC credentials for the CO, SO, and Controller nodes are different.
In the dashboards section, you can see all 13 dashboards. Depending on the number of components and the size of the PowerFlex system, it may take 15-20 minutes for data to populate in the respective dashboards.
In the next part, we will go through the different dashboards and other capabilities of the management pack. Hope it was useful. Cheers!
Dell EMC has recently released a vROps management pack for PowerFlex. It is a monitoring and alerting solution that provides extensive visibility into PowerFlex systems using vROps. The management pack collects key metrics for PowerFlex storage, networking, compute, and server hardware and ingests into vROps. The solution is available to all PowerFlex rack and appliance customers free of cost. This brings additional value to the IT operations and life cycle management functionality delivered by PowerFlex Manager.
Now, let's start with the installation of the management pack. The steps are the same for vROps 8.0, 8.1, and 8.2.