Showing posts with label disk groups. Show all posts

Saturday, April 18, 2020

vSAN performance benchmarking

In this article, I will briefly explain performance benchmarking considerations, the factors that affect performance, and some best practices. We benchmark performance to understand the capabilities and bottlenecks of a system. The system could be a storage system, CPU, GPU, network switch, etc. Now consider a VMware vSAN cluster. It includes multiple components, and each of them contributes to performance. In this case, the vSAN cluster is the solution under test, and we benchmark it to understand its storage performance behavior: the IOPS, latency, and throughput that the cluster can deliver under varying loads.
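The three metrics are related, so a quick calculation can help sanity-check benchmark output. The sketch below is illustrative only: the function names and sample numbers are made up for this example, and the second function relies on Little's Law, a general queueing relation rather than anything specific to vSAN.

```python
def throughput_mib_s(iops: float, block_size_kib: float) -> float:
    """Throughput (MiB/s) implied by an IOPS figure at a fixed block size."""
    return iops * block_size_kib / 1024


def iops_from_latency(outstanding_ios: int, avg_latency_ms: float) -> float:
    """Little's Law: IOPS = outstanding I/Os / average latency in seconds."""
    return outstanding_ios * 1000 / avg_latency_ms


if __name__ == "__main__":
    # Hypothetical numbers, not measured results.
    print(throughput_mib_s(50_000, 4))  # 50,000 IOPS at 4 KiB blocks
    print(iops_from_latency(32, 1.0))   # 32 outstanding I/Os at 1 ms latency
```

If a tool reports numbers that are far from these relationships, it is worth checking the block size and queue depth settings before trusting the result.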

The goal of benchmarking
  • Identify bottlenecks
    • Hardware bottleneck
    • Software bottleneck
    • Application bottleneck
  • Compare tradeoffs
  • Manage expectations
  • Make decisions

In a real-world scenario, benchmarking is usually done once the cluster is deployed and ready, before it starts hosting production workloads. Because these benchmark values define the performance maximums, they help you decide when to scale or upgrade the cluster before it hits a bottleneck.

Fundamental factors of vSAN performance

Server hardware
  • Compatibility as per vSAN HCL
Host
  • Number of hosts in the cluster
  • Power settings
  • CPU - number of cores and frequency 
Storage
  • Hybrid or All-flash
  • NVMe, SAS, or SATA
  • Number of disk groups per host
  • Storage controller configuration
  • Compatibility of hardware devices as per vSAN HCL
Network
  • 10/25/40 GbE
  • MTU 
  • LAG
SPBM policy
  • FTT (Failures To Tolerate)
  • FTM (Failure Tolerance Method: mirroring or erasure coding)
  • Thin or thick provisioning
Security
  • Encryption
  • Checksum
Other
  • Stripe width
  • Flash read cache reservation
  • IOPS limit for object
All of the above factors affect performance, so you should understand the benefits and tradeoffs of each.
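As one example of such a tradeoff, the FTT and FTM settings determine how much raw capacity each object consumes. The sketch below is a rough illustration, not an official sizing tool: the function name is hypothetical, and the erasure coding layouts (3+1 for FTT=1, 4+2 for FTT=2) are assumed to be the ones used by the original vSAN storage architecture. Erasure coding saves capacity but adds parity I/O, which is why it is listed here as a performance factor.

```python
from fractions import Fraction


def raw_capacity_multiplier(ftt: int, ftm: str) -> Fraction:
    """Raw capacity consumed per unit of usable data (assumed layouts)."""
    if ftm == "mirroring":
        # FTT=n keeps n+1 full copies of the data.
        return Fraction(ftt + 1)
    if ftm == "erasure_coding":
        # Assumed layouts: FTT=1 -> 3 data + 1 parity, FTT=2 -> 4 data + 2 parity.
        layouts = {1: (3, 1), 2: (4, 2)}
        if ftt not in layouts:
            raise ValueError("erasure coding assumed only for FTT=1 or FTT=2")
        data, parity = layouts[ftt]
        return Fraction(data + parity, data)
    raise ValueError(f"unknown FTM: {ftm!r}")


if __name__ == "__main__":
    for ftt, ftm in [(1, "mirroring"), (1, "erasure_coding"), (2, "erasure_coding")]:
        print(ftt, ftm, float(raw_capacity_multiplier(ftt, ftm)))
```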

Benchmarking methodology

Image credit: VMware

Storage benchmarking tools

IO load generation tools
Application-specific tools
  • HammerDB (MSSQL, Oracle)
  • Jetstress (MS Exchange)
  • SLOB (Oracle)
  • DBGen (MSSQL, Oracle)

Best practices

  • Understand the production performance metrics.
  • Test what you plan to deploy.
  • Workload modeling.
  • Plan for use case testing.
  • Choose an appropriate size for benchmarking.
  • Choose the right tool.
  • Pre-allocate blocks while testing.
  • Test for a longer time duration.
  • Deploy multiple VMs with multiple VMDKs.


Monday, December 5, 2016

Generic Storage System Architecture


The above diagram shows a generic stand-alone storage system architecture, in which a storage OS is installed on a bare-metal server, turning it into a storage server. Here I am using an enterprise-class storage OS, Open-E DSS V7, installed on a Dell PowerEdge R720xd. The R720xd can hold up to twelve 3.5" drives at the front plus two 2.5" drives at the rear. In the diagram, the last two disks (Disk Group03) are the 2.5" drives installed at the rear, used as the OS drive in RAID1. Apart from that, we have five SAS 7.2K and 15K disks that are grouped into two RAID groups. Comparing disk IOPS, 'Disk Group01' can be considered fast and 'Disk Group02' slow.

As I haven't mentioned the size of each SAS disk, let's assume that a 10TB RAID5 virtual disk (VD) is created from 'Disk Group01' and a 12TB RAID5 VD is created from 'Disk Group02'. You can configure hot spares for each disk group if you have additional disks. As mentioned above, a 10GB RAID1 VD was created on 'Disk Group03' for the OS installation. After the OS (Open-E DSS V7) is installed, it scans the hardware and shows the 10TB and 12TB VDs as available storage units.
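To see how VD sizes like these follow from disk count and disk size, here is a minimal sketch of the standard usable-capacity arithmetic for the RAID levels mentioned in this post. The function name, disk counts, and disk sizes are hypothetical and are not taken from the diagram.

```python
def usable_capacity_tb(raid_level: str, disk_count: int, disk_size_tb: float) -> float:
    """Usable capacity of a RAID group built from equally sized disks."""
    if raid_level == "raid1":
        if disk_count != 2:
            raise ValueError("this sketch treats RAID1 as a two-disk mirror")
        return disk_size_tb
    if raid_level == "raid5":
        if disk_count < 3:
            raise ValueError("RAID5 needs at least 3 disks")
        # One disk's worth of capacity is used for parity.
        return (disk_count - 1) * disk_size_tb
    if raid_level == "raid10":
        if disk_count < 4 or disk_count % 2:
            raise ValueError("RAID10 needs an even number of disks, at least 4")
        # Half of the disks hold mirror copies.
        return (disk_count // 2) * disk_size_tb
    raise ValueError(f"unsupported RAID level: {raid_level!r}")


if __name__ == "__main__":
    # Hypothetical layouts that would yield 10TB and 12TB RAID5 VDs.
    print(usable_capacity_tb("raid5", 6, 2.0))
    print(usable_capacity_tb("raid5", 4, 4.0))
```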

The next step is to create volume groups. Here we created two volume groups (VG00 and VG01) to separate fast and slow storage.

  • VG00 uses VD01
  • VG01 uses VD02
Once the volume groups are created, you can carve out LUNs from each one based on your requirements. For example, if you want a LUN to use as a datastore for your virtual machines, create it on VG00 (fast); if you need one for general-purpose backup files, create it on VG01 (slow).

Note: Here I classified the RAID disk groups by speed. You can also divide them by read and write profile, choosing a RAID10 disk group for write-intensive operations and a RAID5 disk group for read-heavy ones. They can even be divided by access type: one disk group exclusively for sequential access (SQL logs) and another for random access (SQL data, VM datastores, etc.).
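The reason RAID10 suits write-intensive workloads better than RAID5 is the write penalty: each host write costs about 2 back-end I/Os on RAID10 and about 4 on RAID5 (read data, read parity, write data, write parity). The sketch below uses these commonly cited rule-of-thumb penalties. The function name and the sample numbers are hypothetical, and real controllers with write-back cache will behave differently.

```python
# Commonly cited back-end I/Os per host write (rule of thumb, not measured).
WRITE_PENALTY = {"raid0": 1, "raid1": 2, "raid10": 2, "raid5": 4, "raid6": 6}


def effective_iops(raw_iops: float, write_fraction: float, raid_level: str) -> float:
    """Host-visible IOPS for a RAID group given its raw back-end IOPS."""
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    penalty = WRITE_PENALTY[raid_level]
    read_fraction = 1.0 - write_fraction
    # Each read costs 1 back-end I/O; each write costs `penalty` back-end I/Os.
    return raw_iops / (read_fraction + write_fraction * penalty)


if __name__ == "__main__":
    # Hypothetical group with 1000 raw IOPS and a 70% write workload.
    for level in ("raid10", "raid5"):
        print(level, round(effective_iops(1000, 0.7, level), 1))
```

With a read-only workload the two levels come out the same in this model, which matches the note above: the choice matters mainly as the write share grows.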