Redundancy everywhere
At least 2 physical NICs per traffic type (teamed or separate) where possible.
Use dual ToR switches (stack/VPC/MLAG) so every host has uplinks to both.
Avoid single points of failure: no single NIC, single switch, or single VLAN for critical paths.
Traffic separation
At minimum, logically separate:
Management
VM/tenant traffic
Cluster/heartbeat + CSV
Live Migration
Storage (iSCSI or SMB / S2D east-west)
Use dedicated VLANs and where possible dedicated NICs or vNICs (with QoS).
Consistent, deterministic configuration
Same number of NICs, same names, same vSwitch name, same VLANs, same QoS on all nodes.
Standardize IP schemas by function (e.g., 10.10.1.x mgmt, 10.10.2.x LM, 10.10.3.x cluster, etc.).
Throughput > latency for VM & LM, latency > throughput for heartbeat
Heartbeat/cluster doesn’t need huge bandwidth, but must be stable and low-latency.
Live Migration and storage need high bandwidth, low packet loss.
Minimum practical pattern (per host):
Mgmt / Host OS
2 x 1/10/25 GbE (teamed) – management VLAN; BMC/iLO/iDRAC ideally on a separate out-of-band (OOB) network.
VM / Tenant Traffic
2 x 10/25 GbE → vSwitch (SET or LBFO if older) for VMs.
Cluster / CSV / S2D / SMB
2 x 10/25 GbE (RDMA capable strongly recommended: RoCEv2 or iWARP).
Live Migration
Either:
Shared with CSV/S2D (high-bandwidth RDMA), on its own VLAN and with its own QoS, or
A dedicated pair of NICs or vNICs on the main vSwitch.
On smaller hosts, you might combine some roles but never put everything on one adapter team without QoS and VLAN separation.
Use SET (Switch Embedded Teaming) on 2016+
For Hyper-V clusters on 2016/2019/2022+, prefer SET over classic LBFO teaming for VM traffic.
Create one SET vSwitch per host using 2+ physical NICs:
New-VMSwitch -Name "vSwitch-Prod" -NetAdapterName "NIC1","NIC2" -EnableEmbeddedTeaming $true -AllowManagementOS $false
Then add vNICs for Mgmt, LM, CSV, etc. on top of SET if using converged networking.
Converged networking vNICs
On hosts, create vNICs attached to the same vSwitch:
vNIC-Mgmt
vNIC-LiveMigration
vNIC-Cluster/CSV
vNIC-Backup (if needed)
Bind each vNIC to its own VLAN and QoS weight (see QoS section).
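As a sketch, assuming the "vSwitch-Prod" SET switch from above and the VLAN IDs suggested later in this guide (vNIC names are illustrative):
# Create explicit host vNICs on the SET switch
Add-VMNetworkAdapter -ManagementOS -SwitchName "vSwitch-Prod" -Name "vNIC-Mgmt"
Add-VMNetworkAdapter -ManagementOS -SwitchName "vSwitch-Prod" -Name "vNIC-LiveMigration"
Add-VMNetworkAdapter -ManagementOS -SwitchName "vSwitch-Prod" -Name "vNIC-Cluster"

# Tag each vNIC with its own VLAN
Set-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName "vNIC-Mgmt" -Access -VlanId 10
Set-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName "vNIC-LiveMigration" -Access -VlanId 50
Set-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName "vNIC-Cluster" -Access -VlanId 30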
Avoid “-AllowManagementOS $true” in a converged design
Don't let New-VMSwitch auto-create an implicit management vNIC on the main vSwitch:
Create explicit host vNICs instead (via Add-VMNetworkAdapter -ManagementOS), so each role gets its own named vNIC, VLAN, and QoS weight, rather than binding the OS directly to the pNIC.
For older OS (2012 R2)
Use LBFO NIC Teaming with Dynamic load distribution and Switch Independent mode.
Still keep the same concept: single team feeding a vSwitch, then vNICs on top.
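A minimal sketch of that 2012 R2 pattern, assuming uplinks named "NIC1"/"NIC2" (team and switch names are illustrative):
# LBFO team: Switch Independent + Dynamic
New-NetLbfoTeam -Name "Team-Prod" -TeamMembers "NIC1","NIC2" -TeamingMode SwitchIndependent -LoadBalancingAlgorithm Dynamic

# vSwitch on top of the team; Weight mode enables per-vNIC bandwidth weights
New-VMSwitch -Name "vSwitch-Prod" -NetAdapterName "Team-Prod" -MinimumBandwidthMode Weight -AllowManagementOS $false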
Create dedicated VLANs per traffic type
Suggested layout:
VLAN 10 – Management
VLAN 20 – VM/Production
VLAN 30 – Cluster/Heartbeat
VLAN 40 – CSV/S2D/Storage
VLAN 50 – Live Migration
VLAN 60+ – Backup / Replication / DMZ segments as needed
IP schema example
Mgmt: 10.10.10.0/24
Cluster/Heartbeat: 10.10.20.0/24
CSV/S2D/SMB: 10.10.30.0/24
Live Migration: 10.10.40.0/24
iSCSI: 10.10.50.0/24 (non-routed)
Routing rules
Cluster/CSV/S2D and iSCSI subnets are usually non-routed (east-west only).
Enable routing only where necessary and secure with ACLs/firewall to reduce blast radius.
Use Hyper-V / SMB QoS with converged networking
Assign minimum bandwidth weights to each vNIC.
Example weight scheme (total = 100):
VM traffic: 50
CSV/S2D/SMB: 25
Live Migration: 15
Management: 10
PowerShell example (per-vNIC weights require the vSwitch to use minimum-bandwidth Weight mode; VM traffic is covered by the switch's default flow, not a host vNIC):
Set-VMSwitch -Name "vSwitch-Prod" -DefaultFlowMinimumBandwidthWeight 50
Set-VMNetworkAdapter -ManagementOS -Name "vNIC-S2D" -MinimumBandwidthWeight 25
Set-VMNetworkAdapter -ManagementOS -Name "vNIC-LiveMigration" -MinimumBandwidthWeight 15
Set-VMNetworkAdapter -ManagementOS -Name "vNIC-Mgmt" -MinimumBandwidthWeight 10
SMB Multichannel + SMB Direct (RDMA) for S2D / CSV / Live Migration
Ensure multiple NICs/RDMA adapters can be used concurrently.
Keep SMB traffic on dedicated subnets/VLANs.
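A quick verification sketch (standard SMB/RDMA cmdlets; output depends on hardware):
# Confirm RDMA is enabled on the storage NICs
Get-NetAdapterRdma | Where-Object Enabled

# Confirm SMB sees the interfaces and their RDMA capability
Get-SmbClientNetworkInterface
Get-SmbServerNetworkInterface

# After generating SMB traffic, confirm multichannel is actually in use
Get-SmbMultichannelConnection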
Don’t throttle cluster heartbeat too much
Heartbeat is low-bandwidth but time-sensitive.
Ensure it has enough bandwidth and low latency; never force it onto already-saturated paths.
iSCSI SAN
Use dedicated NICs for iSCSI only.
No default gateway on iSCSI NICs; static routes if needed.
Enable Jumbo Frames (MTU 9000) end-to-end if supported (hosts, switches, storage).
Use MPIO with at least two paths per host.
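A host-side sketch, assuming adapters named "iSCSI-A"/"iSCSI-B" and a portal at 10.10.50.10 (all illustrative; the MPIO feature must be installed, and the jumbo-frame keyword/value varies by NIC vendor):
# Jumbo frames on the dedicated iSCSI NICs (must match switches and storage)
Set-NetAdapterAdvancedProperty -Name "iSCSI-A","iSCSI-B" -RegistryKeyword "*JumboPacket" -RegistryValue 9014

# Let MPIO claim iSCSI devices automatically
Enable-MSDSMAutomaticClaim -BusType iSCSI

# Connect the target with multipathing, persistent across reboots
New-IscsiTargetPortal -TargetPortalAddress 10.10.50.10
Get-IscsiTarget | Connect-IscsiTarget -IsMultipathEnabled $true -IsPersistent $true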
SMB 3.x / Storage Spaces Direct
Use RDMA-capable NICs (RoCEv2 or iWARP) with:
PFC/ETS (DCB) properly configured if using RoCE; iWARP runs over TCP and does not require lossless DCB.
At least 2 x 10/25 GbE RDMA NICs per host dedicated to S2D/CSV and LM.
Separate Cluster/S2D VLANs from production.
Avoid mixing storage + noisy VM traffic on the same physical NICs without strong QoS and capacity.
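A minimal DCB sketch for RoCEv2 (priority 3 and the 50% reservation are common conventions, not requirements; switch-side PFC/ETS must match, and the NIC names follow the reference layout at the end):
# Tag SMB Direct traffic (port 445) with priority 3
New-NetQosPolicy -Name "SMB" -NetDirectPortMatchCondition 445 -PriorityValue8021Action 3

# Lossless (PFC) only for the SMB priority
Enable-NetQosFlowControl -Priority 3
Disable-NetQosFlowControl -Priority 0,1,2,4,5,6,7

# ETS bandwidth reservation for SMB
New-NetQosTrafficClass -Name "SMB" -Priority 3 -BandwidthPercentage 50 -Algorithm ETS

# Apply DCB on the RDMA NICs
Enable-NetAdapterQos -Name "NIC3","NIC4"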
Dedicated or converged vNIC
Place Live Migration on its own subnet & VLAN.
Configure LM settings to use that network only (Failover Cluster Manager or PowerShell).
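For example, restricting migrations to the LM subnet from the IP schema above (a sketch; subnet string per Add-VMMigrationNetwork):
Enable-VMMigration

# Use only the dedicated Live Migration subnet
Set-VMHost -UseAnyNetworkForMigration $false
Add-VMMigrationNetwork "10.10.40.0/24"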
Compression vs. SMB vs. RDMA
With RDMA NICs: set the Live Migration transport to SMB (SMB Direct/RDMA).
Without RDMA: Compression is usually faster than plain TCP.
Throttle LM concurrency
Tune number of simultaneous migrations + bandwidth limit so LM doesn’t starve CSV or VM traffic.
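A sketch of transport and concurrency tuning (values illustrative; Set-SmbBandwidthLimit requires the SMB Bandwidth Limit feature, FS-SMBBW):
# Prefer SMB (RDMA) where available; use Compression otherwise
Set-VMHost -VirtualMachineMigrationPerformanceOption SMB

# Limit simultaneous live and storage migrations
Set-VMHost -MaximumVirtualMachineMigrations 2 -MaximumStorageMigrations 2

# Cap SMB-based Live Migration so it can't starve CSV traffic
Set-SmbBandwidthLimit -Category LiveMigration -BytesPerSecond 1GB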
At least two distinct networks
Cluster will use any enabled network, but:
Mark at least one network as “Cluster use only.”
Avoid relying solely on the management network for heartbeats.
Cluster network order
Set network metric / “Role” so:
Storage/CSV network is preferred for CSV traffic.
Management network is used for client access but is not the primary heartbeat path.
Name each network clearly
E.g., “ClusterNet-Heartbeat,” “ClusterNet-CSV,” “MgmtNet-Prod” so troubleshooting is easier.
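A sketch with the FailoverClusters module (network names match the suggested naming; Role values: 0 = none, 1 = cluster only, 3 = cluster and client):
# Rename auto-detected networks to meaningful names
(Get-ClusterNetwork "Cluster Network 1").Name = "ClusterNet-CSV"

# Cluster-only for CSV/heartbeat; cluster + client for management
(Get-ClusterNetwork "ClusterNet-CSV").Role = 1
(Get-ClusterNetwork "MgmtNet-Prod").Role = 3

# Lower metric = preferred for cluster/CSV traffic
(Get-ClusterNetwork "ClusterNet-CSV").Metric = 900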
Isolate management and storage from user/VM networks
Use separate VLANs and secure routing.
Only admin jump hosts and monitoring tools should reach management IPs.
Use firewalls
Harden Windows Firewall with cluster + Hyper-V rules.
Close all non-required ports; restrict RDP, WinRM, SMB to admin/workload subnets.
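For example, scoping the built-in rule groups to the management subnet from the schema above (group names are the English defaults):
# Restrict RDP and WinRM to the management subnet
Set-NetFirewallRule -DisplayGroup "Remote Desktop" -RemoteAddress 10.10.10.0/24
Set-NetFirewallRule -DisplayGroup "Windows Remote Management" -RemoteAddress 10.10.10.0/24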
Use secure management protocols
WinRM over HTTPS, SSH (if needed), RDP gateways.
Disable legacy/weak protocols (SMB1, old cipher suites).
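For example, removing SMB1 (a sketch; the feature name applies to 2016+):
# Disable SMB1 on the server side
Set-SmbServerConfiguration -EnableSMB1Protocol $false -Force

# Remove the SMB1 feature entirely on 2016+
Uninstall-WindowsFeature -Name FS-SMB1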
Protect virtual switches
Enable DHCP guard, Router guard, Port ACLs as needed on VMs.
Allow MAC address spoofing only where required (e.g., NLB, some virtual appliances); see the per-VM sketch below.
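Per-VM sketch (VM name illustrative):
# Block rogue DHCP/router advertisements; disallow MAC spoofing
Set-VMNetworkAdapter -VMName "VM01" -DhcpGuard On -RouterGuard On -MacAddressSpoofing Off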
LACP / Static teaming
Note that SET supports Switch Independent mode only – do not configure LACP or a static LAG toward SET uplinks.
If using LBFO (older designs) in Switch-Dependent mode, configure LACP or a static LAG on the switches.
With Switch Independent teaming: ensure both ToR switches carry the same VLANs in the same L2 domain, but no LAG is required.
Spanning Tree / PortFast
Enable equivalent of PortFast / Edge on server ports to avoid STP delays.
Don’t oversubscribe too heavily; plan uplink capacity vs host aggregate bandwidth.
Consistent switch templates
Same VLANs, trunks, QoS, MTU, ACLs across every ToR switch so any host can plug into any port.
Before putting into production:
Test failover of each NIC (pull cables).
Test switch failure (shut down one ToR).
Validate Live Migration under load.
Validate CSV failover and storage performance.
Run Test-Cluster and fix all networking-related warnings/errors.
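For example (node names illustrative):
# Full validation, or network tests only for a quick re-check
Test-Cluster -Node "HV01","HV02"
Test-Cluster -Node "HV01","HV02" -Include "Network"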
Monitoring
Enable monitoring for:
NIC errors/drops
RDMA counters
Live Migration failures
Cluster heartbeat loss events
Use perfmon / SCOM / Azure Monitor / other tools as appropriate.
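A perfmon sketch (the RDMA Activity counter set exists only on RDMA-capable adapters; paths are the English defaults):
# NIC errors/drops
Get-Counter "\Network Interface(*)\Packets Received Errors"
Get-Counter "\Network Interface(*)\Packets Outbound Errors"

# RDMA throughput
Get-Counter "\RDMA Activity(*)\RDMA Inbound Bytes/sec"
Get-Counter "\RDMA Activity(*)\RDMA Outbound Bytes/sec"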
For each Hyper-V cluster node:
4 x 25GbE NICs:
NIC1+NIC2 → SET vSwitch-Prod
vNIC-VM (VLAN 20)
vNIC-Mgmt (VLAN 10)
vNIC-LM (VLAN 50, QoS weight 15)
vNIC-Cluster (VLAN 30, QoS weight 10)
NIC3+NIC4 → RDMA (no vSwitch)
S2D/CSV/SMB (VLAN 40, dedicated subnets, SMB Multichannel/Direct; bandwidth reserved via DCB/ETS, since vSwitch QoS weights don't apply without a vSwitch)
Optional extra 1GbE:
Out-of-band management / iLO / iDRAC on separate management network.