Tuesday, December 27, 2016

Highly Available SOFS On Clustered Storage Spaces

This article briefly explains the design and steps required to deploy a highly available Scale-Out File Server (SOFS) on clustered storage spaces using Windows Server 2012 R2. 

Note: Here we are using only a single JBOD, but there are several Storage Spaces certified JBODs on the market (eg: DATAON) which are enclosure aware. That means when you use multiple JBODs together, you get data redundancy at the JBOD level too. 


  1. JBOD is connected to both file servers using shared SAS HBA connectors
  2. MPIO is enabled on both servers (SERVER 01/ 02)
  3. Make sure JBOD disks are available to both servers
  4. In this case we have total 24 disks (6 SSD and 18 SAS HDD)
  5. Create new storage pool
    1. By default all available disks are included in a pool named the primordial pool
    2. If the primordial pool is not listed under storage spaces, it indicates that the storage doesn't meet the requirements of storage spaces
    3. If you select the primordial pool, all available disks will be listed under physical disks
    4. Select option create new storage pool
    5. Give it a name
    6. Select physical disks you want to be in the pool
    7. Hot spares can be configured too at this stage
  6. Verify new storage pool is listed under storage pools
  7. Next step is to create virtual disks (these are not VHDX files; they are called spaces)
    1. Now select the new storage pool that you have just created in step 5
    2. Click on tasks, select new virtual disk
    3. Give a name
    4. Select the storage layout (Mirror, Parity, Simple)
    5. Choose provisioning type (Thin, Fixed)
    6. Specify size
  8. Now create volumes
    1. Right click the virtual disk (space) that you have just created in the previous step and select new volume
    2. Select server name and then the virtual disk name
    3. Specify volume size, drive letter, file system type, allocation unit size and volume label 
  9. Create failover cluster using the 2 file servers
    1. Provide a cluster name
    2. Volumes that you created in step 8 will be listed as available volumes
    3.  Add those as cluster shared volumes
    4. Now it appears in C:\ClusterStorage\
  10. Next step is to create a highly available SOFS
    1. On failover cluster manager - roles - new clustered role - file server - file server for scale out application data (SOFS) 
    2. Provide client access point name (eg: SMB-FS01)
    3. Right click on SOFS role in failover cluster manager and select add shared folder
    4. Choose the SMB share profile for server applications (SMB Share - Applications)
    5. Provide a name (eg: DATA01)
    6. Local path to share (C:\ClusterStorage\volume1\shares\DATA01)
    7. Remote path to share (\\SMB-FS01\DATA01)
Now you have a highly available file share (\\SMB-FS01\DATA01), built on clustered storage spaces, where you can store your virtual machine files.
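
If you prefer PowerShell over Server Manager, the same workflow can be sketched roughly as below. This is a minimal outline using the pool, disk, and share names from the steps above; the access group is a made-up placeholder, and in a real cluster you would still add the volume as a Cluster Shared Volume (step 9) before creating the share.

#Sketch: pool, virtual disk, volume, SOFS role and continuously available share
New-StoragePool -FriendlyName "Pool01" -StorageSubSystemFriendlyName "*Storage Spaces*" -PhysicalDisks (Get-PhysicalDisk -CanPool $true)
New-VirtualDisk -StoragePoolFriendlyName "Pool01" -FriendlyName "VDisk01" -ResiliencySettingName Mirror -ProvisioningType Fixed -Size 2TB
Get-VirtualDisk -FriendlyName "VDisk01" | Get-Disk | Initialize-Disk -PassThru | New-Partition -UseMaximumSize -AssignDriveLetter | Format-Volume -FileSystem NTFS -NewFileSystemLabel "Data01"
Add-ClusterScaleOutFileServerRole -Name "SMB-FS01"
New-Item -Path "C:\ClusterStorage\Volume1\Shares\DATA01" -ItemType Directory
New-SmbShare -Name "DATA01" -Path "C:\ClusterStorage\Volume1\Shares\DATA01" -FullAccess "DOMAIN\Hyper-V-Hosts" -ContinuouslyAvailable $true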

Reference: Microsoft 

Monday, December 5, 2016

Generic Storage System Architecture


The above diagram shows a generic stand-alone storage system architecture, where a storage OS is installed on a bare-metal server, turning it into a storage server. Here I am using an enterprise-class storage OS named Open-E DSS V7, installed on a Dell PowerEdge R720xd. The R720xd can take up to twelve 3.5" drives at the front plus two 2.5" drives at the rear. In the diagram, the last two disks (Disk Group03) are the 2.5" rear drives and are used as the OS drive in RAID1. Apart from that, we have five SAS 7.2K and 15K disks grouped into two RAID groups. Comparing disk IOPS, 'Disk Group01' can be considered fast and 'Disk Group02' slow.

As I haven't mentioned the size of each SAS disk, let's assume a 10TB RAID5 virtual disk (VD) is created from 'Disk Group01' and a 12TB RAID5 VD from 'Disk Group02'. You can configure hot spares for each disk group if you have additional disks. As mentioned above, a 10GB RAID1 VD was created from 'Disk Group03' for OS installation. After the OS (Open-E DSS V7) is installed, it scans and shows the 10TB and 12TB VDs as available storage units. 

Now the next step is to create volume groups. Here we created two volume groups (VG00 and VG01) to differentiate fast and slow storage. 

  • VG00 uses VD01
  • VG01 uses VD02
Once the volume groups are created, you can carve out LUNs separately based on your requirements. For example, if you want a LUN that will be used as a datastore for your virtual machines, create it on VG00 (fast); if you need it for storing general-purpose backup files, create it on VG01 (slow). So depending on your requirement you can decide where to create each LUN.

Note: Here I classified the RAID disk groups based on speed. You can also divide them based on reads and writes, choosing a RAID10 disk group for write-intensive operations and a RAID5 disk group for reads. They can even be divided based on access type, i.e. one disk group exclusively for sequential access (SQL logs) and another for random access (SQL data, VM datastores, etc.). 

Monday, November 14, 2016

Best practices while virtualizing Microsoft SQL servers using Hyper-V

-Limit min and max memory for SQL server
-Use fixed size VHDX
-Split data and log files into separate VHDX disks
-Use multiple SCSI controllers (a PowerShell sketch of these disk and memory settings follows this list)
-Right-size the VM and do not over-allocate resources
-Making use of multiple RAID disk groups (for sequential and random access)
-For tier-1 mission critical applications use RAID 10 for data, log files, and tempDB for best performance and availability
-For lower tier SQL workloads when cost is a concern, data and tempDB can be on RAID 5
-Use of SSDs or tiered storage for higher IOPS
-If using VMQ in a Hyper-V environment, enable vRSS on the guest OS side to spread network processing across multiple CPUs
-I prefer using fixed size memory for SQL VMs
-Exclude SQL DB related files (*.mdf, *.ldf, *.ndf, *.bak, *.trn) from on-access antivirus scans
-Disable content indexing on SQL data/ log/ tempDB drives
-Enable lock pages in memory (group policy setting)
-Enable Instant File Initialization (IFI) for database files
-Set OS power plan to high performance
-OS performance options - visual effects - adjust for best performance
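
As promised above, here is a minimal sketch of a few of the disk and memory items, assuming a VM named SQL01 and hypothetical paths and sizes; it is not a complete build script.

#Fixed size VHDX files for data and log (paths and sizes are examples)
New-VHD -Path "E:\VM\SQL01\Virtual Hard Disks\SQL01-Data.vhdx" -SizeBytes 200GB -Fixed
New-VHD -Path "E:\VM\SQL01\Virtual Hard Disks\SQL01-Log.vhdx" -SizeBytes 50GB -Fixed

#Add a second SCSI controller and split data and log disks across controllers
Add-VMScsiController -VMName "SQL01"
Add-VMHardDiskDrive -VMName "SQL01" -ControllerType SCSI -ControllerNumber 0 -Path "E:\VM\SQL01\Virtual Hard Disks\SQL01-Data.vhdx"
Add-VMHardDiskDrive -VMName "SQL01" -ControllerType SCSI -ControllerNumber 1 -Path "E:\VM\SQL01\Virtual Hard Disks\SQL01-Log.vhdx"

#Fixed (static) memory for the SQL VM
Set-VMMemory -VMName "SQL01" -DynamicMemoryEnabled $false -StartupBytes 16GB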

Wednesday, October 26, 2016

3 Node Hyper-V 2012 R2 Cluster Design

The diagram below shows a traditional 3-tier highly available cluster with 3 Hyper-V nodes and 2 shared storage nodes (storage in Active-Passive or Active-Active mode), all connected via network switches.

Compute

*Hyper-V nodes are running on DELL PowerEdge R630 rack servers

Networking

*Shared storage is accessed via MPIO over two separate VLANs (61 and 62)
*VM traffic is over NIC teaming (switch independent and dynamic) - see the sketch below
*Live migration/ cluster network is also teamed together
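
A rough PowerShell sketch of the teaming described above, using hypothetical adapter and switch names:

#Switch independent / dynamic team for VM traffic (adapter names are examples)
New-NetLbfoTeam -Name "VM-Team" -TeamMembers "NIC3","NIC4" -TeamingMode SwitchIndependent -LoadBalancingAlgorithm Dynamic
New-VMSwitch -Name "VM-vSwitch" -NetAdapterName "VM-Team" -AllowManagementOS $false

#Separate team for live migration / cluster traffic
New-NetLbfoTeam -Name "LM-Team" -TeamMembers "NIC5","NIC6" -TeamingMode SwitchIndependent -LoadBalancingAlgorithm Dynamic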

Storage

*We are using DELL PowerEdge R320 with Open-E DSS V7 as storage nodes
*Each node has 8 x 10K SAS drives with RAID 5
*Storage can be either in Active-Passive/ Active-Active cluster mode
*In Active-Passive mode, only one of the storage nodes will be active. That means resources of only one storage node will be utilized at a time
*In Active-Active mode, both servers will be active and serving storage traffic 


Saturday, October 15, 2016

Powershell script to monitor Azure AD Sync Scheduler status

The following PowerShell script will help you monitor the status of the AAD Sync scheduler. Recently we had a requirement to monitor the AAD Sync scheduler status across two servers, a primary and a secondary. On both servers SyncCycleEnabled is set to True. On the primary, StagingModeEnabled is set to False, and on the secondary it is set to True. The AAD Sync scheduler synchronises changes happening in your on-premises directory to Azure AD. To view the current configuration settings and status you can use Get-ADSyncScheduler.

Primary server:
SyncCycleEnabled - True
StagingModeEnabled - False

Secondary server:
SyncCycleEnabled - True
StagingModeEnabled - True

Any change to the above states should trigger a warning message. The server in staging mode (the secondary server in our case) is active for import and synchronisation, but it doesn't run any exports. A server in staging mode continues to receive changes from Active Directory and Azure AD, so it always has a copy of the latest changes and can be failed over to become the primary. If you make configuration changes to your primary server, it is your responsibility to make the same changes to the server in staging mode. SyncCycleEnabled : True indicates that the scheduler is running the import, sync and export processes as part of its operation, so it's important to monitor its status at regular intervals.

Script

#Get the current scheduler status on the local (primary) server
$scheduler = Get-ADSyncScheduler
$localcyclestatus = $scheduler.SyncCycleEnabled
$localstagingstatus = $scheduler.StagingModeEnabled

#Primary server should have SyncCycleEnabled = True and StagingModeEnabled = False
if($localcyclestatus -eq $True -and $localstagingstatus -eq $False)
{
Send-MailMessage -SmtpServer "smtp01.xyz.com" -Subject "AAD Sync Scheduler Operating Normally on Primary Server" -Body "AAD Sync Scheduler Operating Normally on Primary Server" -From "primary@xyz.com" -To "alert@xyz.com"
}

else
{
Send-MailMessage -SmtpServer "smtp01.xyz.com" -Subject "AAD Sync Scheduler Warning on Primary Server" -Body "AAD Sync Scheduler Warning on Primary Server" -From "primary@xyz.com" -To "alert@xyz.com"
}


Similarly, to monitor the secondary server you just have to edit the if condition: both SyncCycleEnabled and StagingModeEnabled should be True, and if not, trigger a warning email. You can run the script locally or execute it remotely using Invoke-Command. To run the check at regular intervals, configure it as a Windows scheduled task.
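
For example, a sketch of running the check remotely and registering it as a scheduled task; the server name, script path, and 30-minute interval below are made-up placeholders.

#Run the check script on the sync server remotely (example names)
Invoke-Command -ComputerName "AADSYNC01" -FilePath "C:\Scripts\Check-AADSyncPrimary.ps1"

#Register a scheduled task that repeats the check every 30 minutes
$action  = New-ScheduledTaskAction -Execute "powershell.exe" -Argument "-File C:\Scripts\Check-AADSyncPrimary.ps1"
$trigger = New-ScheduledTaskTrigger -Once -At (Get-Date) -RepetitionInterval (New-TimeSpan -Minutes 30) -RepetitionDuration ([TimeSpan]::MaxValue)
Register-ScheduledTask -TaskName "AADSyncSchedulerCheck" -Action $action -Trigger $trigger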

Reference: Microsoft

How Windows Logon works

This post gives a brief idea of how interactive logon works in Windows. The logon process is the first step in user authentication and authorisation. The following are the main components of the interactive logon architecture.

-Winlogon
-GINA DLL (Graphical Identification and Authentication Dynamic Link Library)
-LSA (Local Security Authority)
-Authentication packages (NTLM and Kerberos)

Local logon and domain logon process are explained below.

Local logon:



Domain logon:




Note: Diagrams used are from a TechNet article

Tuesday, August 9, 2016

Hyper-V VM deployment using powershell and VHDX templates

The following PowerShell script can be used to deploy virtual machines on a Hyper-V host.

CODE:

#Start
#VM name
[string]$vmname = Read-Host "Name of VM"
$vmcheck = Get-VM -Name $vmname -ErrorAction SilentlyContinue

#To check for duplicate VM on the host
if(!$vmcheck)
{
Write-Host "Above warning can be ignroed as there is no duplicate VM. Please proceed and enter following details. `n"
[int32]$gen = Read-Host "Generation type"
[int32]$cpu = Read-Host "Number of vCPU"
[string]$vmpath = Read-Host "Enter path for VM configuration files (Eg: E:\VM)"
[string]$dynamic = $null

while("yes","no" -notcontains $dynamic)
{
$dynamic = Read-Host "Will this VM use dynamic memory? (yes/no)"
}

#Dynamic memory parameters
if($dynamic -eq "yes")
{
[int64]$minRAM = Read-Host "Minimum memory (MB)"
[int64]$maxRAM = Read-Host "Maximum memory (MB)"
[int64]$startRAM = Read-Host "Starting memory (MB) [Note: Specify value between $minRAM and $maxRAM]"
$minRAM = 1MB*$minRAM
$maxRAM = 1MB*$maxRAM
$startRAM = 1MB*$startRAM

#Creating the VM with dynamic RAM
New-VM -Name $vmname -Path $vmpath -Generation $gen
Set-VM -Name $vmname -DynamicMemory -MemoryStartupBytes $startRAM -MemoryMinimumBytes $minRAM -MemoryMaximumBytes $maxRAM
}

else
{
#Creating the VM with static RAM
[int64]$staticRAM = Read-Host "Static memory (MB)"
$staticRAM = 1MB*$staticRAM
New-VM -Name $vmname -Path $vmpath -Generation $gen -MemoryStartupBytes $staticRAM
}

#Setting VM auto start to none and auto stop to shutdown
Set-VM -Name $vmname -ProcessorCount $cpu -AutomaticStartAction Nothing -AutomaticStopAction ShutDown

#Creating VM hard disk directory
New-Item -path $vmpath\$vmname -name "Virtual Hard Disks" -type directory

#Enabling processor compatibility configuration for migration
Set-VMProcessor $vmname -CompatibilityForMigrationEnabled $true
}#vmcheck ends here

else
{
Write-Host "A VM named $vmname already exists"
}
#End

Now the VM is created, but it doesn't have a virtual hard disk (VHDX file) yet. Assuming that you already have a sysprepped VHDX template, copy the template to the 'Virtual Hard Disks' folder of the VM you just created and rename it as per your standard. Now attach the disk to the SCSI controller (Generation 2) or the IDE controller (Generation 1), change the boot order so the hard drive is the first boot entry, and connect the NIC to a vSwitch. Now you can start your VM.
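
A rough sketch of those final steps for a Generation 2 VM; the template path and vSwitch name below are hypothetical.

#Copy and attach the sysprepped template (paths and switch name are examples)
Copy-Item -Path "E:\Templates\WS2012R2.vhdx" -Destination "$vmpath\$vmname\Virtual Hard Disks\$vmname.vhdx"
Add-VMHardDiskDrive -VMName $vmname -ControllerType SCSI -Path "$vmpath\$vmname\Virtual Hard Disks\$vmname.vhdx"

#Make the new disk the first boot device (Generation 2 only)
$vhd = Get-VMHardDiskDrive -VMName $vmname
Set-VMFirmware -VMName $vmname -FirstBootDevice $vhd

#Connect the NIC and start the VM
Connect-VMNetworkAdapter -VMName $vmname -SwitchName "vSwitch01"
Start-VM -Name $vmname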


Reference:

techthoughts
starwindsoftware

Sunday, July 10, 2016

What happens when you enable Intel VT

Let's consider the difference between virtualized and non-virtualized platforms.


Here VMM refers to the hypervisor. There are different privilege levels in the processor for instruction execution. These levels are called rings (Ring 0, 1, 2, 3).

When you enable Intel VT:
  • In a non-virtualized environment the OS runs in Ring 0, and a single operating system controls all hardware resources
  • Four privilege levels (rings) are employed on VT platforms
  • When VT is enabled, the hypervisor runs in Ring 0 instead of an OS, and the guest OS runs in Ring 1 or Ring 3
  • VT allows the hypervisor to present each guest OS a virtual machine (VM) environment that emulates the hardware environment needed by the guest OS

When you enable Intel VT-x:
  • Intel (VT-x) - is a hardware assisted virtualization technology
  • Hardware support for processor virtualization enables system vendors to provide simple, robust, and reliable hypervisor software
  • VT-x consists of a set of virtual machine extensions (VMX) that support virtualization of processor hardware for multiple software environments using virtual machine
  • A hypervisor written to take advantage of Intel® Virtualization Technology runs in a new CPU mode called "VMX Root" mode and the guest OS runs in "VMX Non-root" mode. The VMM manages the virtual machines through the VM Exit and VM Entry mechanism
  • Hypervisor has its own privileged level (VMX Root) where it executes
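
As a side note, on a Windows host you can get a rough check of whether these extensions are present and enabled in firmware; the sketch below uses the Win32_Processor WMI class, which I believe exposes these properties on recent Windows versions.

#Check whether VT-x (VMX) extensions exist and are enabled in firmware
Get-CimInstance -ClassName Win32_Processor |
    Select-Object Name, VMMonitorModeExtensions, VirtualizationFirmwareEnabled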

The figure below shows the difference in ring levels between Intel VT and Intel VT-x.


Reference: Intel

Saturday, July 9, 2016

Anatomy of Hyper-V cluster debug log

  • Get-ClusterLog dumps the events to a text file
  • Location: C:\Windows\Cluster\Reports\Cluster.log
  • It captures the last 72 hours of logs
  • The cluster log is in GMT (because of geographically spanned multi-site clusters)
  • Usage: Get-ClusterLog -TimeSpan x (gives the last "x" minutes of logs)
  • You can also set the levels of logs
  • Set-ClusterLog -Level 3 (level 3 is default)
  • It can be from level 0 to level 5 (increasing level of logging has performance impact)
  • Level 5 will provide the highest level of detail
  • Log format:
    [ProcessID] [ThreadID] [Date/Time] [INFO/WARN/ERR/DBG] [ResourceType] [ResourceName] [Description]
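
For instance, a quick sketch of pulling a recent log from every node and raising the verbosity (the destination folder is an example):

#Dump the last 15 minutes of cluster log from every node to a local folder
Get-ClusterLog -TimeSpan 15 -Destination "C:\Temp\ClusterLogs"

#Raise cluster logging verbosity (0-5, default 3); higher levels have a performance impact
Set-ClusterLog -Level 5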

Troubleshooting Live Migration issues on Hyper-V

  1. Check whether enough resources (CPU, RAM) are available at the destination host
  2. Make sure all nodes in the cluster follow same naming standard for vSwitches
  3. Check whether NUMA spanning is enabled or not. If NUMA spanning is disabled, a VM must fit entirely within a single physical NUMA node or it will not start, restore, or migrate
  4. Constrained delegation should be configured for all servers in the cluster if you are using Kerberos authentication protocol for live migration
  5. Check that live migration is enabled in Hyper-V settings
  6. Verify Hyper-V-High-Availability logs in event viewer
  7. Finally, check the cluster debug log (Get-ClusterLog -TimeSpan) in C:\Windows\Cluster\Reports\Cluster.log 
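
For points 4 and 5, a quick check from PowerShell; the sketch below shows the relevant host settings, and enabling Kerberos authentication is only an example, not a prescription.

#Verify live migration is enabled and how it authenticates on each host
Get-VMHost | Select-Object VirtualMachineMigrationEnabled, VirtualMachineMigrationAuthenticationType, MaximumVirtualMachineMigrations

#Enable live migration with Kerberos (requires constrained delegation to be configured)
Enable-VMMigration
Set-VMHost -VirtualMachineMigrationAuthenticationType Kerberos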

How Live Migration works on Hyper-V

1.Live migration setup

  • The source host creates a TCP connection with the destination host
  • VM configuration data is transferred to the destination host
  • A skeleton VM is set up at the destination host
  • Physical memory is allocated to that VM
2.Memory pages are transferred from the source to destination host

  • The in-state memory (working set) of the VM is transferred first
  • The default page size is 4 KB
  • All utilized pages are copied to the destination
  • Pages modified during the copy are tracked by the source and marked as modified
  • Several iterations of the copy process take place
3.Remaining modified pages will be transferred to destination host

  • The VM is then registered and the device state is transferred
  • Fewer modified pages means a faster migration
  • The total working set is copied to the destination
4.Move storage handle from source to destination

  • Until this step, the VM at the destination host is not online
5.Once control of the storage is transferred, the VM is brought online and resumed at the destination

  • Now the VM is completely migrated and running on the destination host
6.Network cleanup

  • A message is sent to the physical network switch, causing it to relearn the MAC address of the migrated VM 
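
In practice, a live migration between cluster nodes can be started with a one-liner like the sketch below; the VM and node names are hypothetical.

#Live-migrate a clustered VM to another node
Move-ClusterVirtualMachineRole -Name "VM01" -Node "HV-NODE02" -MigrationType Live

#For a non-clustered host-to-host move, Move-VM can be used instead
Move-VM -Name "VM01" -DestinationHost "HV-NODE02"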

Wednesday, May 11, 2016

Nutanix Certifications

For all those who are keen to take the Nutanix administration course and become a certified NPP (Nutanix Platform Professional), I strongly recommend visiting their education portal http://nuschool.nutanix.com and enrolling yourself. I completed the course and cleared the NPP Certification Exam 4.5 last Monday. It's a decent self-paced course which may take around 8-10 hours.

The exam has 50 objective questions and you need a minimum of 80% to pass. If you are not successful on the first attempt, don't worry, you have two more chances. On my first attempt I scored only 3523/5000, and on my second attempt I scored 4500/5000. 

Once you pass the NPP, you can apply for the next level of certification, NSS (Nutanix Support Specialist), and then for the ultimate level, NPX (Nutanix Platform Expert).

Tuesday, March 29, 2016

Hyper-V on Nutanix

This article briefly explains Hyper-V on the Nutanix virtual computing platform. The figure below shows an 'N' node Nutanix architecture, where each node is an independent server unit with a hypervisor, processor, memory and local storage (a combination of SSD and HDD). Along with this, each node has a CVM (Controller VM) through which storage resources are accessed.

'N' node Hyper-V over Nutanix architecture
Local storage from all the nodes is combined into a virtualized and unified storage pool by the Nutanix Distributed File System (NDFS). This can be considered an advanced NAS which delivers a unified storage pool across the cluster, with features like striping, replication, error detection, failover, automatic recovery, etc. From this pool, shared storage resources are presented to the hypervisors using containers.


NDFS - Logical view of storage pool and containers


As mentioned above, each Hyper-V node has its own CVM which is shown below.
Hyper-V node with a CVM

PRISM console

View of storage pool in PRISM
View of containers in PRISM
Containers are mounted to Hyper-V as SMB 3.0 based file shares where virtual machine files are stored. This is shown below.

SMB 3.0 share path to store VHDs and VM configuration files

Once the share path is specified correctly as mentioned above, you can create virtual machines on your Hyper-V server.
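
For example, a sketch of creating a VM whose configuration files and VHDX live on such an SMB 3.0 container share; the share path and VM name below are made up.

#Create a VM with its configuration and virtual disk on the Nutanix container share (example path)
New-VM -Name "VM01" -Generation 2 -MemoryStartupBytes 4GB -Path "\\NTNX-CLUSTER\container01" -NewVHDPath "\\NTNX-CLUSTER\container01\VM01\VM01.vhdx" -NewVHDSizeBytes 60GB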

Tuesday, January 26, 2016

Zoning and LUN masking

Zoning and LUN masking are used to isolate SAN traffic and to restrict access to storage devices. For example, you might manage separate zones for test and production environments so that they do not interfere with each other. If you want to restrict certain hosts from accessing the storage devices, you have to set up zoning. This is generally done at the FC switch level. Zoning is of two types: soft zoning and hard zoning.

Soft zoning is based on the WWN (World Wide Name) of the device, while hard zoning is configured at the FC switch port level. Soft zoning offers greater flexibility: even if you move a device from one port to another on the FC switch, it keeps the same access rights, since the restriction is based on the device's WWN. The downside is that WWN spoofing can be used to gain access to zones you aren't supposed to see. With hard zoning at the switch port level, you get tighter access control but less flexibility than with soft zoning. If you now move the device from one port to another as we did before, it won't be able to see its partner. In this case you can't spoof a physical port unless you are standing in the same room as the switch.

Once zoning is done, you can further restrict access to SAN LUNs by using LUN masking. This prevents certain devices from seeing specific LUNs hosted on the storage device. LUN masking is done at the storage controller level or the OS level of the storage device. It is recommended to use zoning and LUN masking together to secure storage traffic.