vSwitch and failback, a word of warning
Failback is enabled by default for vSwitch NIC teaming. Combined with the default failover detection method, “Link Status”, a switch or switch port failure can cause host isolation.
I have seen these default settings trip up a lot of people and thought it best to spell it out: use portfast on all switch interfaces that connect to VMware infrastructure, or else turn off failback or don’t use Link Status.
The issue occurs when an interface on the vSwitch that carries the Management port group goes down. With Active/Passive teaming, the management interface flips to the other interface (assuming the primary fails). This happens immediately and causes no issue.
When the failed interface comes back up without portfast, failback flips the port group's traffic back to that interface immediately, because Link Status detection checks only whether the link is up. The switch, however, first puts the port into the spanning tree listening and learning states, where it watches for loops and learns MAC addresses. During that time traffic from the management interface is not forwarded. How long this lasts depends on the switch's “forward delay” setting. With the default of 15 seconds per state, the port forwards nothing for about 30 seconds, which is long enough to trigger the default host isolation timeout.
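The timing argument can be sketched in a few lines of Python. This is only an illustrative model, not anything VMware or a switch vendor ships: the function names are made up, the 15 second forward delay is the classic spanning tree default, and the 15 second isolation timeout is an assumption that you should replace with the value configured in your own cluster.

```python
def blackout_seconds(forward_delay: float = 15.0, portfast: bool = False) -> float:
    """Seconds a switch port forwards no traffic after its link comes up.

    Classic spanning tree holds the port in listening and then learning,
    each for one forward delay. Portfast skips both states.
    """
    if portfast:
        return 0.0
    return 2 * forward_delay  # listening + learning


def isolation_triggered(isolation_timeout: float,
                        forward_delay: float = 15.0,
                        portfast: bool = False,
                        failback: bool = True) -> bool:
    """True if failing back to the recovered uplink would isolate the host."""
    if not failback:
        # Management traffic stays on the surviving uplink, so the
        # recovered port's blackout never affects it.
        return False
    return blackout_seconds(forward_delay, portfast) >= isolation_timeout


# Assumed isolation timeout of 15 seconds (hypothetical value, check your own).
print(isolation_triggered(15.0))                  # defaults everywhere
print(isolation_triggered(15.0, portfast=True))   # portfast on the switch port
print(isolation_triggered(15.0, failback=False))  # failback disabled
```

Only the first case reports isolation. Either enabling portfast or disabling failback removes the window in which management traffic is sent to a port that is not yet forwarding.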
The easy option is to disable failback, but it is better to use portfast so that active interfaces in a vSwitch are always used when they are available.