Thursday, November 30, 2017

Software Defined Storage using ScaleIO

In this article I will briefly explain ScaleIO and the options available for deploying it as a software-defined storage (SDS) solution.

ScaleIO is a very good option for customers who are moving towards software-defined storage and hyperconverged infrastructure. Because ScaleIO supports multiple hypervisors and operating systems, such as VMware ESXi, Hyper-V, RHEL and Windows, customers with a heterogeneous IT infrastructure get the most benefit out of it. It also offers multiple deployment modes: hyperconverged, two-layer and mixed. Most of you are probably familiar with the term hyperconverged, where compute and storage run together on the same box and you scale both resources together by adding more nodes to the cluster. Two-layer mode is a storage-only configuration in which the storage resources scale separately from compute; it is essentially a virtual SAN infrastructure implemented using ScaleIO SDS. Mixed mode usually occurs when transitioning from a storage-only configuration to hyperconverged.
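
To make the three deployment modes concrete, here is a minimal Python sketch that classifies a cluster from the roles of its nodes. The Node class, its field names and the deployment_mode function are hypothetical names made up for illustration; they are not part of ScaleIO or any of its tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical description of one server in the cluster."""
    name: str
    runs_compute: bool          # hosts application workloads / VMs
    contributes_storage: bool   # its local disks are part of the storage pool


def deployment_mode(nodes):
    """Classify a cluster as hyperconverged, two-layer or mixed.

    Only the nodes that contribute storage decide the mode:
    - all of them also run compute  -> hyperconverged
    - none of them run compute      -> two-layer (storage only)
    - some do and some do not       -> mixed
    """
    storage_nodes = [n for n in nodes if n.contributes_storage]
    if not storage_nodes:
        raise ValueError("cluster has no storage nodes")
    converged = [n for n in storage_nodes if n.runs_compute]
    if len(converged) == len(storage_nodes):
        return "hyperconverged"
    if not converged:
        return "two-layer"
    return "mixed"
```

For example, a cluster whose storage nodes are all storage-only would be reported as two-layer, and it would become mixed as soon as one of those nodes also starts running workloads, which matches the transition scenario described above.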

Now for an overview of how to deploy ScaleIO on VMware and RHEL platforms. ScaleIO integrates tightly with VMware, and a PowerShell script and a vCenter plugin are provided to simplify the deployment. On RHEL, you can use the Installation Manager (IM), which is part of the ScaleIO Gateway, for quick and easy deployment of a ScaleIO cluster. Customers have multiple options to consume ScaleIO. They can buy the ScaleIO software alone and build the cluster on commodity x86 hardware (not a great idea for production deployments, as they have to identify and use validated/qualified hardware and software components themselves to ensure seamless operation and proper support), or they can buy ScaleIO Ready Nodes, which are prevalidated, preconfigured and optimized PowerEdge servers for deploying a ScaleIO cluster. There is also another offering, VxRack System Flex, which is a rack-scale hyperconverged solution built on Dell EMC PowerEdge servers with integrated Cisco networking and ScaleIO software.

Let's have a look at the major components of ScaleIO. The figure below shows a 5-node hyperconverged ScaleIO cluster running on a highly available VMware platform. The three main components of ScaleIO are:

  • SDC - ScaleIO Data Client
  • SDS - ScaleIO Data Server
  • MDM - Meta Data Manager


In this scenario, all 5 nodes have ESXi installed and clustered, and each node has local hard disks. It is the responsibility of the ScaleIO software to pool the hard disks from all 5 nodes, forming a distributed virtual SAN.

SDC is a lightweight driver responsible for presenting LUNs provisioned from the ScaleIO system. SDS is responsible for managing the local disks present in each node. MDM contains all the metadata required for system operation and configuration changes. It manages the metadata, SDCs, SDSs, system capacity, device mappings, volumes, data protection, errors/failures, rebuild and rebalance operations, etc. ScaleIO supports 3-node and 5-node MDM clusters. The figure above shows a 5-node MDM cluster: there are three manager MDMs, of which one is the master and two are slaves, plus two Tie-Breakers (TB) that help decide the master MDM by maintaining a majority in the cluster. In a production environment with 5 or more nodes, a 5-node MDM cluster is recommended, as it can tolerate 2 MDM failures.
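
The failure tolerance figures follow from simple majority arithmetic. The sketch below assumes that the MDM cluster stays operational as long as a strict majority of its members (managers plus Tie-Breakers) is alive. This is a simplified model of the voting, not ScaleIO's actual implementation, and the function names are made up for illustration.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of members that forms a strict majority."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def max_tolerated_failures(cluster_size: int) -> int:
    """Members that can fail while a strict majority is still alive."""
    return cluster_size - majority(cluster_size)


def has_majority(cluster_size: int, alive: int) -> bool:
    """True if the surviving members can still form a majority."""
    return alive >= majority(cluster_size)


# 3-member cluster: majority is 2, so 1 failure is tolerated.
# 5-member cluster: majority is 3, so 2 failures are tolerated.
for size in (3, 5):
    print(size, majority(size), max_tolerated_failures(size))
```

Under this model a 5-node MDM cluster with two members down still has 3 of 5 alive and keeps its majority, while a third failure would break it.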

ScaleIO uses a distributed two-way mesh mirror scheme to protect data against disk or node failures. For QoS, you can limit both bandwidth and IOPS for each volume provisioned from a ScaleIO cluster. As for scalability, a single ScaleIO cluster supports up to 1024 nodes. In very large ScaleIO deployments, it is highly recommended to configure separate protection domains and fault sets to minimize the impact of multiple simultaneous failures.
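
The idea behind two-way mirroring can be illustrated with a toy model in Python. The sketch assumes that every chunk of data is stored as exactly two copies on two different nodes, and that the copies are spread over all node pairs in turn. The real ScaleIO placement logic is more involved, and the function names here are hypothetical.

```python
from itertools import combinations, cycle


def place_mirrored_chunks(nodes, chunk_count):
    """Give each chunk two copies on two different nodes.

    Node pairs are used round-robin, so the copies end up meshed
    across the whole cluster instead of between fixed partner nodes.
    """
    if len(nodes) < 2:
        raise ValueError("two-way mirroring needs at least 2 nodes")
    pairs = cycle(combinations(nodes, 2))
    return {chunk: next(pairs) for chunk in range(chunk_count)}


def chunks_lost(placement, failed_nodes):
    """Chunks whose two copies both sit on failed nodes."""
    failed = set(failed_nodes)
    return [chunk for chunk, copies in placement.items()
            if failed.issuperset(copies)]


nodes = ["n1", "n2", "n3", "n4", "n5"]
placement = place_mirrored_chunks(nodes, chunk_count=20)
print(chunks_lost(placement, ["n1"]))        # any single node failure loses nothing
print(chunks_lost(placement, ["n1", "n2"]))  # chunks mirrored only on n1 and n2
```

In this model no single node failure loses data, because the second copy always lives on another node. Losing two nodes at the same time can still hit chunks whose two copies sat on exactly those nodes, which is the kind of exposure that protection domains and fault sets are meant to limit in large clusters.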

You can download ScaleIO software for free to test and play around in your lab environment.

References:
Dell EMC ScaleIO Basic Architecture
Dell EMC ScaleIO Design Considerations And Best Practices
Dell EMC ScaleIO Ready Node