Monday, November 30, 2015

Nutanix : a web-scale hyper converged infrastructure solution for enterprise datacenters

Nutanix is an industry leader in hyper converged infrastructure and software defined storage optimized for virtual workloads. You can think of it as a cluster-in-a-box solution with compute, storage and hypervisor consolidated into a 1U or 2U enclosure. Interestingly, a Nutanix architecture has no RAID and needs no SAN storage either. Storage is entirely local: data is stored on direct attached disks (a combination of SSD and SAS disks).

What does it look like?


Front and rear side of a Nutanix appliance (eg : NX-1000)


Each Nutanix box contains 4 independent nodes which are clustered together. This is shown in the figure below.

Nutanix box with 4 nodes

Each of these nodes operates independently, with its own CPU, RAM, HDDs etc, and all the nodes are clustered together. Each time you want to increase compute and storage capacity, you can add more boxes (with 1, 2 or 4 nodes depending on the need) to the cluster. The detailed logical architecture of a single node is given below.

Single Nutanix node architecture

As you can see, each node has SSDs as well as SAS HDDs for storage. There is also a controller VM, which is a virtual storage controller that runs on every node to improve scalability and resiliency while preventing performance bottlenecks. This controller VM is something like a VSA, but it does more than that. It is more intelligent than a traditional VSA and is capable of automated tiering, data locality, de-duplication and much more. All storage controllers in a cluster communicate with each other, forming the Nutanix distributed file system.

For each read, there are 3 levels of cache: an in-memory cache within each node, then a hot tier (SSDs) and finally a cold tier (SAS HDDs). The hypervisor communicates with the controller VM just as it would with a physical storage controller. When a write operation happens, the VM contacts the virtual storage controller and the data is written first to the local SSDs.

To protect the data, it is then replicated to multiple nodes in the cluster, so that it is always available even if a node fails. We can have RF2 (2 way replication) or RF3 (3 way replication). It is an auto healing system: if a node fails and only one copy of some data is left, the system automatically identifies it using MapReduce-style analytics and replicates it to other nodes.
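The replication and auto healing behaviour described above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not how Nutanix implements it: the class `ToyCluster`, its methods and the random choice of replica nodes are all hypothetical, and the model tracks only which nodes hold a copy of each block.

```python
import random


class ToyCluster:
    """Toy model of replication factor (RF) and self-healing; not Nutanix code."""

    def __init__(self, nodes, rf=2):
        if rf > len(nodes):
            raise ValueError("replication factor cannot exceed node count")
        self.nodes = set(nodes)
        self.rf = rf
        self.placement = {}  # block id -> set of nodes holding a copy

    def write(self, block_id, local_node):
        # The first copy lands on the node where the VM runs (data locality);
        # the remaining rf - 1 copies go to other nodes in the cluster.
        others = sorted(self.nodes - {local_node})
        remote = random.sample(others, self.rf - 1)
        self.placement[block_id] = {local_node} | set(remote)

    def fail_node(self, node):
        # Drop the failed node, then re-replicate every block that lost a copy.
        self.nodes.discard(node)
        for replicas in self.placement.values():
            replicas.discard(node)
            if not replicas:
                continue  # every copy was lost; nothing left to heal from
            candidates = sorted(self.nodes - replicas)
            missing = self.rf - len(replicas)
            replicas.update(random.sample(candidates, min(missing, len(candidates))))


cluster = ToyCluster(["A", "B", "C", "D"], rf=2)
cluster.write("blk1", local_node="A")
cluster.fail_node("A")
print(cluster.placement["blk1"])  # two surviving nodes hold the block again
```

With RF2 the block survives the loss of one node; passing `rf=3` to the same sketch keeps three copies, so two nodes can fail before any re-replication has completed.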

If you want to add more nodes, all you have to do is connect them to the network and power them on; the new nodes are discovered automatically by an auto discovery protocol that runs on top of IPv6. So it's very easy to add a new node to a cluster. You can dynamically expand your cluster resources by adding more boxes without shutting down the cluster. Rolling upgrades can be done without downtime by updating the controller VMs in a cluster one by one. Each node is clustered at the Nutanix architecture level, and you can cluster at the hypervisor level too (say, a VMware ESXi cluster using vCenter Server), providing a highly available web-scale hyper converged solution.

DELL and Nutanix have partnered to introduce the DELL XC Series appliances, which are optimized for virtual workloads.

References :
www.nutanix.com

Saturday, November 28, 2015

Shared Nothing Live Migration

Shared Nothing Live Migration is a Hyper-V 3.0 feature that helps us live migrate virtual machines from one Hyper-V server to another without shared storage or cluster membership.
 
Note : Failover clustering provides HA, whereas shared nothing live migration is a mobility solution that gives flexibility for planned movement of VMs between Hyper-V hosts without downtime.

Hyper-V settings for Live migrations 

As a prerequisite, we need to standardize network connectivity on the Hyper-V host machines (eg : vSwitches should have the same names for VM traffic, iSCSI traffic etc). For the shared nothing live migration traffic we can use a separate VLAN (say, VLAN 90) so that it won't affect the local LAN.

Separate VLAN for live migration traffic
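The vSwitch naming prerequisite is easy to get wrong when hosts are configured by hand. Below is a minimal sketch of a consistency check. The function name `missing_vswitches`, the host names and the vSwitch names are all hypothetical, and the inventory is assumed to have been collected separately (for example, by noting the switch names on each host).

```python
def missing_vswitches(host_switches):
    """Return, per host, the vSwitch names that exist elsewhere but not on it.

    host_switches: dict mapping host name -> iterable of vSwitch names.
    """
    all_names = set()
    for names in host_switches.values():
        all_names.update(names)
    return {
        host: sorted(all_names - set(names))
        for host, names in host_switches.items()
        if all_names - set(names)
    }


# Hypothetical inventory of two Hyper-V hosts.
inventory = {
    "HV01": ["VM-Traffic", "iSCSI", "LiveMigration"],
    "HV02": ["VM-Traffic", "iSCSI"],
}
print(missing_vswitches(inventory))  # {'HV02': ['LiveMigration']}
```

An empty result means every host exposes the same set of vSwitch names, which is what the migration needs in order to reconnect the VM's network adapters on the destination host.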


Also, we need to configure constrained delegation on the Hyper-V servers so that the Kerberos authentication protocol can be used when managing the servers remotely. This is shown below.

Use Kerberos

Delegation to specified services


 

Best practice recommendations for iSCSI network adapters



Note : all of these settings are enabled by default; as a best practice, we need to disable them on all iSCSI NICs.

Also, if your network and network devices support jumbo frames, they should be enabled on the network adapters too.


Recommended BIOS settings for DELL PowerEdge 12G servers

BIOS settings for optimal performance

Memory mode : Optimizer
Node interleave : Disabled
Logical processor : Enabled
QPI frequency : Maximum frequency
CPU power management : Maximum performance
Turbo boost : Enabled
C1E : Disabled
C-states : Disabled
Memory frequency : Maximum performance
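When several hosts need the same BIOS profile, the list above can be kept as data and compared against what each host reports. The following is a rough sketch: the setting names and values come from the list above, while the function `bios_deviations` and the example host are hypothetical, and how the current values are read from a server is left out.

```python
RECOMMENDED_BIOS = {
    "Memory mode": "Optimizer",
    "Node interleave": "Disabled",
    "Logical processor": "Enabled",
    "QPI frequency": "Maximum frequency",
    "CPU power management": "Maximum performance",
    "Turbo boost": "Enabled",
    "C1E": "Disabled",
    "C-states": "Disabled",
    "Memory frequency": "Maximum performance",
}


def bios_deviations(current):
    """Return {setting: (current value, recommended value)} for mismatches."""
    return {
        name: (current.get(name), wanted)
        for name, wanted in RECOMMENDED_BIOS.items()
        if current.get(name) != wanted
    }


# Hypothetical host where C-states were left enabled.
host = {**RECOMMENDED_BIOS, "C-states": "Enabled"}
print(bios_deviations(host))  # {'C-states': ('Enabled', 'Disabled')}
```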

Thursday, November 5, 2015

RAID configuration using PERC

PERC stands for PowerEdge RAID Controller. Here we have 3 physical disks. We will configure 2 RAID 5 virtual disks (VDs) using these 3 physical disks.

VD00 - 100 GB
VD01 - 1.7 TB
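As a quick sanity check on these sizes: RAID 5 spends one disk's worth of capacity on parity, so n disks give (n - 1) disks of usable space. The sketch below assumes, purely for illustration, three 900 GB disks (the disk size is not stated in this post) and decimal units; a real controller reports somewhat less usable space per disk.

```python
def raid5_usable_gb(disk_count, disk_size_gb):
    """Usable capacity of a RAID 5 group: one disk's worth goes to parity."""
    if disk_count < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return (disk_count - 1) * disk_size_gb


usable = raid5_usable_gb(3, 900)  # 900 GB per disk is an assumed size
vd00_gb, vd01_gb = 100, 1700      # VD00 and VD01 from the list above
print(usable, usable - vd00_gb - vd01_gb)  # 1800 0
```

Under that assumption the two virtual disks together consume the whole disk group, which matches creating VD01 from all the space left after VD00.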

Once the system starts, press Ctrl+R to enter the PERC configuration utility and follow the steps shown below.

No configuration present and 3 disks available

Press F2 and create new VD

VD00 properties

Click OK

VD00 - 100 GB created

Press F2 and add new VD

VD01 properties

VD00 and VD01 created

Now we have successfully created 2 VDs. The next step is to initialize both VDs.

Initialization of VD00

Start Init

Click OK

Initialization VD00 in progress

Similarly, initialize the next VD too. Once it's completed, you can exit the PERC utility and reboot the machine.