Last week I was at the VMware Tech Summit in Cork, Ireland, where I attended a session about NSX-T troubleshooting. During this session a lot of issues came up that I have dealt with in the last year or so.
One of the issues was about the IP bindings of a VM to a segment. In our case a tenant manually edited the IPv4 address on the network interface of the VM, and after this the connection to this VM dropped.
We checked the DFW and the routing, and all was OK. After some digging we found out that the VM had 2 IP addresses in the Realized Bindings section on the logical switch. This view can be found in the Manager UI of the segment port.
How can you find these Realized Bindings and fix them?
To get to the right port, go to Segments and look for the segment to which the VM is connected. Once you have found it, click on the number shown beneath Ports.
This opens a window listing all ports connected to the segment. Copy the Segment Port Name.
Now search in the Search Bar for this Segment Port Name, and click the one with Resource Type Logical Ports.
This takes you to the Manager UI of this logical port. Alternatively, you can go through Networking -> switch to Manager UI in the upper right corner -> select Logical Switches -> search through the list for the right port.
Select Address Bindings. Here you can see the Auto Discovered Bindings, both with the current IP of the VM: one learned from VMware Tools and the other by ARP Snooping. But if you take a close look at the Realized Bindings, you can see a different IP learned by ARP Snooping. This was the original IP the VM had when it connected.
This can cause connection problems! In our case the whole routing was messed up and the traffic went out via the wrong Uplink.
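To make the mismatch concrete, here is a minimal sketch of the check I was doing by eye: flag every realized IP that no discovery source currently reports. The dictionaries and their keys (`ip`, `source`) and the addresses are hypothetical stand-ins for what the UI shows, not an NSX-T data format.

```python
from typing import Dict, List, Set


def stale_realized_ips(discovered: List[Dict[str, str]],
                       realized: List[Dict[str, str]]) -> Set[str]:
    """Return realized IPs that no discovery source currently reports."""
    current = {binding["ip"] for binding in discovered}
    return {binding["ip"] for binding in realized if binding["ip"] not in current}


# Hypothetical data mirroring the situation described above.
discovered = [
    {"ip": "10.0.0.20", "source": "VM_TOOLS"},
    {"ip": "10.0.0.20", "source": "ARP_SNOOPING"},
]
realized = [
    {"ip": "10.0.0.20", "source": "VM_TOOLS"},
    {"ip": "10.0.0.10", "source": "ARP_SNOOPING"},  # original IP, still pinned
]

print(sorted(stale_realized_ips(discovered, realized)))  # ['10.0.0.10']
```

Any IP this returns is a candidate for the Ignore Bindings list.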
We can fix this quickly by moving the entry with the old IP address to the Ignore Bindings.
It will take a few seconds to update the Realized Bindings with the new IP address learned by ARP Snooping.
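If you would rather script this than click through the UI, the sketch below only prepares the updated port body; it does not talk to NSX-T. I am assuming the Manager API logical port object carries an `ignore_address_bindings` list whose entries use `ip_address` and `mac_address`, so please verify those names against the API reference for your version. The port ID, revision, and MAC address are made up.

```python
import copy
from typing import Any, Dict


def with_ignored_binding(port: Dict[str, Any], ip: str, mac: str) -> Dict[str, Any]:
    """Return a copy of the port body with (ip, mac) added to the ignore list."""
    updated = copy.deepcopy(port)  # leave the fetched body untouched
    # Field names below are assumptions, see the note above.
    ignored = updated.setdefault("ignore_address_bindings", [])
    entry = {"ip_address": ip, "mac_address": mac}
    if entry not in ignored:  # keep the operation idempotent
        ignored.append(entry)
    return updated


# Hypothetical port body, as it might look after fetching it from the manager.
port = {"id": "port-1", "display_name": "tenant-vm.vmx@port", "_revision": 3}
new_body = with_ignored_binding(port, "10.0.0.10", "00:50:56:aa:bb:cc")
print(new_body["ignore_address_bindings"])
```

The function returns a new body instead of editing in place, so you can diff it against what you fetched before sending anything back.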
After this the connection came up and the tenant was happy!
But this was nothing more than a quick fix. What if all tenants go mad and start manually changing their IP addresses in the OS?
So why does the old IP address stay in the Realized Bindings section and keep causing carnage?
By default, the discovery methods ARP snooping and ND snooping operate in a mode called trust on first use (TOFU). In TOFU mode, when an address is discovered and added to the realized bindings list, that binding remains in the realized list forever.
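A toy model helps to see why the old address sticks. The sketch below replays ARP sightings against a binding list that holds a limited number of entries; with TOFU on, entries never expire, with TOFU off they are dropped once they have not been seen for a while. The `limit` and `timeout` values and the whole expiry mechanism are simplifying assumptions for illustration, not how NSX-T implements it internally.

```python
from typing import Dict, List, Tuple


def realized_bindings(events: List[Tuple[int, str]], tofu: bool,
                      limit: int = 1, timeout: int = 10) -> List[str]:
    """Replay (minute, ip) ARP sightings and return the realized IPs at the end."""
    last_seen: Dict[str, int] = {}
    for minute, ip in events:
        if not tofu:
            # Without TOFU: drop bindings not refreshed within the timeout.
            last_seen = {known: seen for known, seen in last_seen.items()
                         if minute - seen < timeout}
        # A new IP is only accepted while there is room in the list.
        if ip in last_seen or len(last_seen) < limit:
            last_seen[ip] = minute
    return list(last_seen)


# The tenant changes the IP from .10 to .20 a few minutes after first boot.
events = [(0, "10.0.0.10"), (5, "10.0.0.20"), (20, "10.0.0.20")]

print(realized_bindings(events, tofu=True))   # ['10.0.0.10']
print(realized_bindings(events, tofu=False))  # ['10.0.0.20']
```

With TOFU on, the first address occupies the only slot forever, which is the behaviour we ran into.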
Can we modify that mode? Yes we can!
In NSX-T we use several profiles; one of those is the IP Discovery Profile. This profile can be found in the Policy UI under Segments -> Segment Profiles.
Create a new IP Discovery Profile and disable the TOFU setting. When you do this, TOFU changes to Trust On Every Use (TOEU). In TOEU mode, the discovered IP addresses are placed in the binding list and deleted when they expire. DHCP snooping and VMware Tools always operate in TOEU mode.
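The same profile could probably be created through the Policy API. The sketch below only builds the request path and body and sends nothing. The path `/policy/api/v1/infra/ip-discovery-profiles/<id>`, the resource type `IPDiscoveryProfile`, and the `tofu_enabled` field are how I remember the API, so treat them as assumptions and check the API guide for your NSX-T version. The profile ID is made up.

```python
import json
from typing import Any, Dict, Tuple


def ip_discovery_profile_request(profile_id: str) -> Tuple[str, Dict[str, Any]]:
    """Build the (path, body) for a profile with TOFU switched off."""
    # Path and field names are assumptions, see the note above.
    path = f"/policy/api/v1/infra/ip-discovery-profiles/{profile_id}"
    body = {
        "resource_type": "IPDiscoveryProfile",
        "display_name": profile_id,
        "tofu_enabled": False,  # False means bindings expire instead of sticking
    }
    return path, body


profile_path, profile_body = ip_discovery_profile_request("ipd-no-tofu")
print(profile_path)
print(json.dumps(profile_body, indent=2))
```

Keeping the payload construction separate from the HTTP call makes it easy to review exactly what would be sent before pointing it at a real manager.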
Now we need to adjust the segment to use the new IP Discovery Profile. Go to the segment and click Edit.
Under Segment Profiles select the new IP Discovery Profile (the one with TOFU disabled), click Save and then Close Editing.
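For many segments, attaching the profile by hand gets tedious. As a rough sketch, this builds the request for one segment without sending it. I am assuming the Policy API exposes this as a discovery profile binding map under the segment, with an `ip_discovery_profile_path` field; the segment ID, profile ID, and map ID are hypothetical, and the names should be verified against the API guide.

```python
from typing import Any, Dict, Tuple


def segment_profile_binding_request(segment_id: str, profile_id: str,
                                    map_id: str = "default") -> Tuple[str, Dict[str, Any]]:
    """Build the (path, body) that points a segment at an IP Discovery Profile."""
    # Path and field names are assumptions, see the note above.
    path = (f"/policy/api/v1/infra/segments/{segment_id}"
            f"/segment-discovery-profile-binding-maps/{map_id}")
    body = {
        "resource_type": "SegmentDiscoveryProfileBindingMap",
        "ip_discovery_profile_path": f"/infra/ip-discovery-profiles/{profile_id}",
    }
    return path, body


# Hypothetical IDs; loop over your own segment list in the same way.
for segment in ["tenant-a-web", "tenant-a-db"]:
    binding_path, binding_body = segment_profile_binding_request(segment, "ipd-no-tofu")
    print(binding_path, "->", binding_body["ip_discovery_profile_path"])
```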
Now when a tenant manually changes the IP of the network interface, the old IP that ARP Snooping learned the first time is no longer present in the Realized Bindings section.