
Saturday, June 30, 2018

Introduction to Nutanix cluster components

In this article I will briefly explain the different components of a Nutanix cluster. The major components are listed below.

Nutanix cluster components
  1. Stargate: Data I/O manager for the cluster.
  2. Medusa: Access interface for Cassandra.
  3. Cassandra: Distributed metadata store.
  4. Curator: Handles Map Reduce cluster management and cleanup.
  5. Zookeeper: Manages cluster configuration.
  6. Zeus: Access interface for Zookeeper.
  7. Prism: Management interface for Nutanix UI, nCLI and APIs.
Stargate
  • Responsible for all data management and I/O operations.
  • It is the main point of contact for a Nutanix cluster.
  • Workflow: Read/write from VM <-> Hypervisor <-> Stargate.
  • Stargate works closely with Curator to ensure data is protected and optimized.
  • It also depends on Medusa to gather metadata and Zeus to gather cluster configuration data.
Medusa
  • Medusa is the Nutanix abstraction layer that sits in front of the database that holds the cluster metadata.
  • Stargate and Curator communicate with Cassandra through Medusa.
Cassandra
  • It is a distributed, high-performance and scalable database.
  • It stores all metadata about all VMs stored in a Nutanix datastore.
  • It needs acknowledgement from at least one other Cassandra node to commit its operations (a simple sketch of this rule follows below).
  • Cassandra depends on Zeus for cluster configuration.
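To picture that commit rule, here is a minimal, purely conceptual Python sketch (not Nutanix code; the Replica class and CVM names are made up). A metadata write is committed only after at least one other node acknowledges it.

    # Conceptual sketch only -- not Nutanix source code.
    # Models the rule above: a metadata write commits only after
    # at least one other Cassandra node acknowledges it.

    class Replica:
        """Stand-in for a peer Cassandra node (hypothetical)."""
        def __init__(self, name, online=True):
            self.name, self.online, self.store = name, online, {}

        def write(self, key, value):
            if self.online:
                self.store[key] = value
                return True          # peer acknowledges the write
            return False             # node is down, no acknowledgement

    def commit_metadata_write(key, value, local, peers):
        local.write(key, value)                          # write locally first
        acks = sum(p.write(key, value) for p in peers)   # collect peer acks
        return acks >= 1                                 # commit only with >= 1 ack

    local = Replica("cvm-1")
    peers = [Replica("cvm-2"), Replica("cvm-3", online=False)]
    print(commit_metadata_write("vm42:vdisk", "extent-map", local, peers))   # True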
Curator
  • Curator constantly accesses the environment and is responsible for managing and distributing data throughout the cluster.
  • It performs disk balancing and information lifecycle management.
  • A Curator master node is elected, and it manages task and job delegation.
  • The master node coordinates periodic scans of the metadata database and identifies cleanup and optimization tasks that Stargate or other components should perform.
  • It is also responsible for analyzing the metadata; this analysis is distributed across all Curator nodes using a MapReduce algorithm (a simplified sketch follows below).
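As a rough illustration of that MapReduce-style scan, here is a simplified Python sketch (not Curator's real logic; the extent records and thresholds are invented). Each Curator node maps over its slice of metadata and the elected master reduces the partial results into a task summary for Stargate.

    # Simplified MapReduce-style metadata scan -- illustrative only.

    from collections import Counter

    def map_phase(metadata_slice):
        """Each Curator node scans its slice of metadata and flags findings."""
        findings = Counter()
        for extent in metadata_slice:
            if extent["replicas"] < 2:
                findings["under_replicated"] += 1    # needs re-replication
            if extent["orphaned"]:
                findings["garbage"] += 1             # needs cleanup
        return findings

    def reduce_phase(partial_results):
        """The elected Curator master merges results into a single task summary."""
        total = Counter()
        for part in partial_results:
            total.update(part)
        return total

    slices = [
        [{"replicas": 1, "orphaned": False}, {"replicas": 2, "orphaned": True}],
        [{"replicas": 2, "orphaned": False}],
    ]
    print(reduce_phase(map_phase(s) for s in slices))
    # Counter({'under_replicated': 1, 'garbage': 1})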
Zookeeper
  • It runs on 3 nodes in the cluster.
  • This can be increased to 5 nodes in the cluster.
  • Zookeeper coordinates and distributes services.
  • One node is elected as the leader.
  • All Zookeeper nodes can process reads.
  • The leader is responsible for cluster configuration write requests and forwards them to its peers.
  • If the leader fails to respond, a new leader is elected (the quorum sketch below shows the majority rule behind this).
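The reason for running an odd number of Zookeeper nodes (3, or 5) is that writes and leader elections need a majority of nodes to be alive. A tiny Python sketch of that majority rule (ensemble sizes only, no real Zookeeper calls):

    # Majority (quorum) rule behind a 3- or 5-node Zookeeper ensemble.

    def has_quorum(total_nodes, failed_nodes):
        """A write or a leader election succeeds only if a majority is alive."""
        alive = total_nodes - failed_nodes
        return alive > total_nodes // 2

    for ensemble in (3, 5):
        for failed in range(ensemble + 1):
            state = "quorum OK" if has_quorum(ensemble, failed) else "quorum lost"
            print(f"{ensemble}-node ensemble with {failed} failed node(s): {state}")
    # A 3-node ensemble survives 1 failure, a 5-node ensemble survives 2.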
Zeus
  • Zeus is the Nutanix library interface which all other components use to access cluster configuration information.
  • It is responsible for cluster configuration and leadership logs.
  • If Zeus goes down, everything goes down!
Prism
  • Prism is the central entity for viewing activity inside the cluster.
  • It is the management gateway for administrators to configure and monitor a Nutanix cluster (a REST API example follows below).
  • A Prism leader node is also elected within the cluster.
  • Prism depends on data stored in Zookeeper and Cassandra.
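Since Prism is also the gateway for the APIs mentioned earlier, cluster details can be pulled with a few lines of Python. Treat the example below as a hedged sketch: the address is hypothetical, and the port (9440) and v2.0 endpoint path are assumptions to verify against the API Explorer of your Prism/AOS version.

    # Hedged example: query cluster details through the Prism REST API.
    # The URL, port 9440 and the v2.0 endpoint path are assumptions -- verify
    # them in your Prism version's API Explorer before relying on this.

    import requests
    from requests.auth import HTTPBasicAuth

    PRISM = "https://prism.example.local:9440"        # hypothetical address
    session = requests.Session()
    session.auth = HTTPBasicAuth("admin", "password") # replace with real credentials
    session.verify = False                            # lab only: self-signed cert

    resp = session.get(f"{PRISM}/PrismGateway/services/rest/v2.0/cluster/")
    resp.raise_for_status()
    cluster = resp.json()
    print(cluster.get("name"), cluster.get("num_nodes"), cluster.get("version"))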

Note: All the information provided above is based on the Nutanix 4.5 Platform Professional (NPP) administration course.

Tuesday, December 27, 2016

Highly Available SOFS On Clustered Storage Spaces

This article briefly explains the design and steps required to deploy a highly available Scale-Out File Server (SOFS) on clustered storage spaces using Windows Server 2012 R2.

Note: Here we are using only a single JBOD, but there are several Storage Spaces certified JBODs available in the market (e.g. DATAON) that are enclosure aware. That means when you use multiple JBODs together, you get data redundancy at the JBOD level too.


  1. JBOD is connected to both file servers using shared SAS HBA connectors
  2. MPIO is enabled on both servers (SERVER 01/ 02)
  3. Make sure JBOD disks are available to both servers
  4. In this case we have a total of 24 disks (6 SSD and 18 SAS HDD)
  5. Create new storage pool
    1. By default, all available disks are included in a pool named the primordial pool
    2. If the primordial pool is not listed under storage spaces, it indicates that the storage doesn't meet the requirements of storage spaces
    3. If you select the primordial pool, all available disks will be listed under physical disks
    4. Select the option to create a new storage pool
    5. Give it a name
    6. Select physical disks you want to be in the pool
    7. Hot spares can be configured too at this stage
  6. Verify new storage pool is listed under storage pools
  7. The next step is to create virtual disks (these are not VHDX files; they are called spaces)
    1. Now select the new storage pool that you have just created in step 5
    2. Click on tasks, select new virtual disk
    3. Give a name
    4. Select the storage layout (Mirror, Parity, Simple)
    5. Choose provisioning type (Thin, Fixed)
    6. Specify size
  8. Now create volumes
    1. Right click the virtual disk (space) that you have just created in the previous step and select new volume
    2. Select server name and then the virtual disk name
    3. Specify volume size, drive letter, file system type, allocation unit size and volume label 
  9. Create failover cluster using the 2 file servers
    1. Provide a cluster name
    2. Volumes that you created in step 8 will be listed as available volumes
    3.  Add those as cluster shared volumes
    4. Now it appears in C:\ClusterStorage\
  10. Next step is to create a highly available SOFS
    1. In Failover Cluster Manager: Roles > New Clustered Role > File Server > Scale-Out File Server for application data (SOFS)
    2. Provide client access point name (eg: SMB-FS01)
    3. Right click the SOFS role in Failover Cluster Manager and select Add File Share
    4. Choose the SMB Share - Applications profile
    5. Provide a name (eg: DATA01)
    6. Local path to the share (C:\ClusterStorage\Volume1\Shares\DATA01)
    7. Remote path to share (\\SMB-FS01\DATA01)
Now you have a highly available file share (\\SMB-FS01\DATA01), built on clustered storage spaces, to store your virtual machine files. A scripted version of these steps is sketched below.
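The sketch drives the equivalent PowerShell cmdlets from Python. The pool, space and share names follow the examples above; the cluster name FS-CLU01 and the -FullAccess account are made up, and volume creation / Add-ClusterSharedVolume between the steps is omitted for brevity, so adapt and test this before using it on real servers.

    # Hedged automation sketch of the SOFS build above (run on one of the file servers).
    # Pool/space/share names follow the article; FS-CLU01 and the -FullAccess account
    # are hypothetical. Volume creation and Add-ClusterSharedVolume are omitted here.

    import subprocess

    STEPS = [
        # Step 5: create the storage pool from all poolable JBOD disks
        '$d = Get-PhysicalDisk -CanPool $true; '
        'New-StoragePool -FriendlyName "Pool01" '
        '-StorageSubSystemFriendlyName (Get-StorageSubSystem)[0].FriendlyName '
        '-PhysicalDisks $d',
        # Step 7: create a mirrored, fixed-provisioned space (virtual disk)
        'New-VirtualDisk -StoragePoolFriendlyName "Pool01" -FriendlyName "Space01" '
        '-ResiliencySettingName Mirror -ProvisioningType Fixed -Size 2TB',
        # Step 9: build the two-node failover cluster
        'New-Cluster -Name "FS-CLU01" -Node SERVER01,SERVER02',
        # Step 10: add the Scale-Out File Server role and the SMB share
        'Add-ClusterScaleOutFileServerRole -Name "SMB-FS01"',
        'New-SmbShare -Name "DATA01" -Path "C:\\ClusterStorage\\Volume1\\Shares\\DATA01" '
        '-FullAccess "DOMAIN\\HyperV-Hosts"',
    ]

    for ps in STEPS:
        subprocess.run(["powershell.exe", "-NoProfile", "-Command", ps], check=True)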

Reference: Microsoft 

Monday, December 5, 2016

Generic Storage System Architecture


The above diagram shows a generic stand-alone storage system architecture, where a storage OS is installed on a bare metal server, making it a storage server. Here I am using an enterprise-class storage OS named Open-E DSS V7, installed on a Dell PowerEdge R720xd. The R720xd can have up to twelve 3.5" drives at the front plus two 2.5" drives at the rear. In the diagram, the last 2 disks (Disk Group03) are the 2.5" drives installed at the rear, used as the OS drive in RAID1. Apart from that, we have five SAS 7.2K and 15K disks that are grouped into two RAID groups. Comparing disk IOPS, 'Disk Group01' can be considered fast and 'Disk Group02' slow.

As I haven't mentioned the size of each SAS disk, let's assume a 10TB RAID5 virtual disk (VD) is created using 'Disk Group01' and a 12TB RAID5 VD is created using 'Disk Group02'. You can configure hot spares for each disk group if you have additional disks. As mentioned above, for OS installation we created a 10GB RAID1 VD using 'Disk Group03'. After installation of the OS (Open-E DSS V7), it scans and shows the 10TB and 12TB VDs as available storage units.
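Those usable sizes are easy to sanity-check. The small Python sketch below shows one possible combination of disk counts and sizes that yields 10TB and 12TB RAID5 VDs; the counts and sizes used are assumptions, since the actual hardware details are not stated above.

    # Quick usable-capacity check per RAID level.
    # The disk counts and sizes used below are assumptions, not the actual hardware.

    def usable_tb(raid_level, disks, disk_tb):
        if raid_level == "RAID1":
            return disk_tb                   # mirrored pair keeps one disk's capacity
        if raid_level == "RAID5":
            return (disks - 1) * disk_tb     # one disk's worth of parity is lost
        if raid_level == "RAID10":
            return (disks // 2) * disk_tb    # striped mirrors keep half the capacity
        raise ValueError(raid_level)

    print(usable_tb("RAID5", 6, 2))   # 10 TB usable (e.g. 6 x 2TB for Disk Group01)
    print(usable_tb("RAID5", 7, 2))   # 12 TB usable (e.g. 7 x 2TB for Disk Group02)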

The next step is to create volume groups. Here we created two volume groups (VG00 and VG01) to differentiate between fast and slow storage.

  • VG00 uses VD01
  • VG01 uses VD02
Once the volume groups are created, you can carve out LUNs separately based on your requirements. For example, if you want a LUN that is going to be used as a datastore for your virtual machines, you can create it on VG00 (fast); if you need it for storing general purpose backup files, you can create it on VG01 (slow). So, depending on your requirement, you can decide where to create each LUN.
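That placement decision can be captured as a simple rule. Below is a tiny Python sketch; the volume group names follow the article, while the workload labels are made-up examples.

    # Sketch of the LUN placement rule above: fast workloads on VG00, the rest on VG01.
    # Workload labels are illustrative only.

    PLACEMENT = {
        "vm_datastore": "VG00",    # latency-sensitive -> fast volume group
        "sql_data":     "VG00",
        "backup":       "VG01",    # general purpose / capacity -> slow volume group
        "file_archive": "VG01",
    }

    def volume_group_for(workload):
        return PLACEMENT.get(workload, "VG01")    # default to the slow tier

    print(volume_group_for("vm_datastore"), volume_group_for("backup"))   # VG00 VG01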

Note: Here I classified the RAID disk groups based on speed. You can also divide them based on reads and writes, so that you use a RAID10 disk group for write-intensive operations and a RAID5 disk group for reads. They can even be divided based on access type, meaning a disk group exclusively for sequential access (SQL logs) and another for random access (SQL data, VM datastores, etc.).