This article provides basic guidance on troubleshooting BGP-related issues between NSX-T Edge nodes and TOR switches.
Figure: Sample diagram showing connectivity between Edge nodes and TOR switches
Verify Tier-0 Gateway status on NSX-T
- The status of the T0 gateway should be Success.
- Check the interfaces of the T0 to identify which edge nodes are part of it.
- Check the status of the Edge Transport Nodes.
- As the T0 interfaces show, Edge01/02/03/04 are part of it, and you should be able to see the SR_TIER0 component on each of those edge nodes. The next step is to log in to the edge nodes that are part of the T0 and verify the BGP summary.
Verify BGP on all Edge nodes that are part of T0 Gateway
- SSH into the edge node as the admin user.
- Run get logical-router and look for SERVICE_ROUTER_TIER0.
sc2-01-nsxt04-r08edge02> get logical-router
Logical Router
UUID VRF LR-ID Name Type Ports Neighbors
736a80e3-23f6-5a2d-81d6-bbefb2786666 0 0 TUNNEL 4 22/5000
e6d02207-c51e-4cf8-81a6-44afec5ad277 2 84653 DR-t1-domain-c1034:1de3adfa-0ee DISTRIBUTED_ROUTER_TIER1 5 9/50000
a590f1da-2d79-4749-8153-7b174d23b069 32 85271 DR-t1-domain-c1034:1de3adfa-0ee DISTRIBUTED_ROUTER_TIER1 5 5/50000
758d9736-6781-4b3a-906f-3d1b03f0924d 33 88016 DR-t1-domain-c1034:1de3adfa-0ee DISTRIBUTED_ROUTER_TIER1 4 1/50000
5e7bfe98-0b5e-4620-90b1-204634e99127 37 3 SR-sc2-01-nsxt04-tr SERVICE_ROUTER_TIER0 6 5/50000
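If you need to repeat this lookup on several edge nodes, the VRF number can be pulled out of saved `get logical-router` output with a few lines of Python. This is only a minimal sketch: the function name `find_tier0_sr_vrfs` is hypothetical, and it assumes the whitespace-separated row layout shown above (UUID, VRF, LR-ID, Name, Type, ...).

```python
def find_tier0_sr_vrfs(cli_output):
    """Return (vrf, name) pairs for every SERVICE_ROUTER_TIER0 row."""
    results = []
    for line in cli_output.splitlines():
        fields = line.split()
        # Assumed row layout: UUID VRF LR-ID Name Type Ports Neighbors
        if "SERVICE_ROUTER_TIER0" in fields and len(fields) >= 5:
            type_idx = fields.index("SERVICE_ROUTER_TIER0")
            # VRF is the second column; the name sits just before the type.
            results.append((fields[1], fields[type_idx - 1]))
    return results


SAMPLE_LOGICAL_ROUTER = """\
UUID VRF LR-ID Name Type Ports Neighbors
736a80e3-23f6-5a2d-81d6-bbefb2786666 0 0 TUNNEL 4 22/5000
e6d02207-c51e-4cf8-81a6-44afec5ad277 2 84653 DR-t1-domain-c1034:1de3adfa-0ee DISTRIBUTED_ROUTER_TIER1 5 9/50000
5e7bfe98-0b5e-4620-90b1-204634e99127 37 3 SR-sc2-01-nsxt04-tr SERVICE_ROUTER_TIER0 6 5/50000
"""

print(find_tier0_sr_vrfs(SAMPLE_LOGICAL_ROUTER))
# [('37', 'SR-sc2-01-nsxt04-tr')]
```

The VRF value returned here (37) is the one to pass to the vrf command in the next step.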
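- Enter the VRF context of the Tier-0 SR with vrf <SERVICE_ROUTER_TIER0 VRF> (37 in this example), then run get bgp neighbor summary.
- Note: If everything is working fine, the State column should show Estab for every neighbor.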
sc2-01-nsxt04-r08edge02> vrf 37
sc2-01-nsxt04-r08edge02(tier0_sr[37])> get bgp neighbor summary
BFD States: NC - Not configured, DC - Disconnected
AD - Admin down, DW - Down, IN - Init, UP - Up
BGP summary information for VRF default for address-family: ipv4Unicast
Router ID: 10.184.248.2 Local AS: 4259971071
Neighbor AS State Up/DownTime BFD InMsgs OutMsgs InPfx OutPfx
10.184.248.239 4259970544 Estab 05w1d22h NC 12641393 12610093 2 568
10.184.248.240 4259970544 Estab 05w1d23h NC 12640337 11580431 2 566
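When checking many edge nodes, it can help to flag neighbors whose state is anything other than Estab automatically. The following is a rough sketch that parses saved `get bgp neighbor summary` output. The function names are hypothetical, and it assumes the nine-column row layout shown above; rows for sessions that are not established may be formatted differently on your version.

```python
def parse_edge_bgp_summary(cli_output):
    """Parse neighbor rows that follow the 'Neighbor AS State ...' header."""
    neighbors = []
    in_table = False
    for line in cli_output.splitlines():
        fields = line.split()
        if fields[:3] == ["Neighbor", "AS", "State"]:
            in_table = True
            continue
        # Assumed columns: Neighbor AS State Up/DownTime BFD InMsgs OutMsgs InPfx OutPfx
        if in_table and len(fields) == 9:
            neighbors.append({
                "neighbor": fields[0],
                "as": fields[1],
                "state": fields[2],
                "bfd": fields[4],
                "in_pfx": fields[7],
                "out_pfx": fields[8],
            })
    return neighbors


def not_established(neighbors):
    """Return the neighbor IPs whose state is not Estab."""
    return [n["neighbor"] for n in neighbors if n["state"] != "Estab"]


SAMPLE_EDGE_SUMMARY = """\
Router ID: 10.184.248.2 Local AS: 4259971071
Neighbor AS State Up/DownTime BFD InMsgs OutMsgs InPfx OutPfx
10.184.248.239 4259970544 Estab 05w1d22h NC 12641393 12610093 2 568
10.184.248.240 4259970544 Estab 05w1d23h NC 12640337 11580431 2 566
"""

peers = parse_edge_bgp_summary(SAMPLE_EDGE_SUMMARY)
print(not_established(peers))  # [] because both peers are Estab
```

An empty list means every parsed neighbor is established; any IP in the list is a peer to investigate further.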
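- You should be able to ping the BGP neighbor IPs from the T0 SR VRF. If the pings fail, there is a connectivity issue between the edge node and the TOR switch that needs to be fixed before BGP can come up. (The packet loss reported in the sample output below is most likely just an artifact of interrupting the ping with ^C.)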
sc2-01-nsxt04-r08edge02(tier0_sr[37])> ping 10.184.248.239
PING 10.184.248.239 (10.184.248.239): 56 data bytes
64 bytes from 10.184.248.239: icmp_seq=0 ttl=255 time=1.788 ms
^C
--- 10.184.248.239 ping statistics ---
2 packets transmitted, 1 packets received, 50.0% packet loss
round-trip min/avg/max/stddev = 1.788/1.788/1.788/0.000 ms
sc2-01-nsxt04-r08edge02(tier0_sr[37])> ping 10.184.248.240
PING 10.184.248.240 (10.184.248.240): 56 data bytes
64 bytes from 10.184.248.240: icmp_seq=0 ttl=255 time=1.925 ms
64 bytes from 10.184.248.240: icmp_seq=1 ttl=255 time=1.251 ms
^C
--- 10.184.248.240 ping statistics ---
3 packets transmitted, 2 packets received, 33.3% packet loss
round-trip min/avg/max/stddev = 1.251/1.588/1.925/0.337 ms
- Check the T0 SR uplink interface and note its IP and MAC address.
sc2-01-nsxt04-r08edge02> vrf 37
sc2-01-nsxt04-r08edge02(tier0_sr[37])> get interfaces | more
Fri Aug 19 2022 UTC 11:07:18.042
Logical Router
UUID VRF LR-ID Name Type
5e7bfe98-0b5e-4620-90b1-204634e99127 37 3 SR-sc2-01-nsxt04-tr SERVICE_ROUTER_TIER0
Interfaces (IPv6 DAD Status A-DAD_Success, F-DAD_Duplicate, T-DAD_Tentative, U-DAD_Unavailable)
Interface : dd83554d-47c0-5a4e-9fbe-3abb1239a071
Ifuid : 335
Mode : cpu
Port-type : cpu
Enable-mcast : false
Interface : 008b2b15-17d1-4cc8-9d94-d9c4c2d0eb3a
Ifuid : 1000
Name : tr-interconnect-edge02
Fwd-mode : IPV4_AND_IPV6
Internal name : uplink-1000
Mode : lif
Port-type : uplink
IP/Mask : 10.184.248.2/24
MAC : 02:00:70:51:9d:79
VLAN : 1611
Verify BGP on Cisco TOR switches
❯ ssh -o PubkeyAuthentication=no netadmin@sc2-01-r08lswa.xxxxxxxx.com
User Access Verification
(netadmin@sc2-01-r08lswa.xxxxxxxx.com) Password:
Cisco Nexus Operating System (NX-OS) Software
sc2-01-r08lswa# show ip bgp summary
BGP summary information for VRF default, address family IPv4 Unicast
BGP router identifier 10.184.17.248, local AS number 65001.65008
BGP table version is 520374, IPv4 Unicast config peers 10, capable peers 8
5150 network entries and 11372 paths using 2003240 bytes of memory
BGP attribute entries [110/18920], BGP AS path entries [69/1430]
BGP community entries [0/0], BGP clusterlist entries [0/0]
11356 received paths for inbound soft reconfiguration
11356 identical, 0 modified, 0 filtered received paths using 0 bytes
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
10.184.10.14 4 65011.65000
47979514 10570342 520374 0 0 5w1d 4541
10.184.10.78 4 65011.65000
47814555 10601750 520374 0 0 5w1d 4541
10.184.248.1 4 65001.65535
80831 79447 520374 0 0 02:41:51 566
10.184.248.2 4 65001.65535
3215614 3269391 520374 0 0 5w1d 566
10.184.248.3 4 65001.65535
3215776 3269344 520374 0 0 1w3d 566
10.184.248.4 4 65001.65535
3215676 3269383 520374 0 0 13:51:45 566
10.184.248.5 4 65001.65535
3200531 3269384 520374 0 0 5w1d 5
10.184.248.6 4 65001.65535
3197752 3266700 520374 0 0 5w1d 5
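Note that the edge node prints 4-byte AS numbers in plain decimal (for example 4259971071), while the Nexus switch in this example prints them in dotted notation (65001.65535). To confirm that both sides are talking about the same AS, you can convert between the two forms. The helper names below are hypothetical; the arithmetic is simply high * 65536 + low.

```python
def asdot_to_asplain(asdot):
    """Convert '65001.65535' style notation to a plain decimal AS number."""
    if "." not in asdot:
        return int(asdot)
    high, low = (int(part) for part in asdot.split("."))
    if not (0 <= high <= 0xFFFF and 0 <= low <= 0xFFFF):
        raise ValueError("each half must fit in 16 bits: %r" % asdot)
    return high * 65536 + low


def asplain_to_asdot(asplain):
    """Convert a plain decimal AS number to dotted notation."""
    if asplain < 65536:
        return str(asplain)  # 2-byte AS numbers are written without a dot
    return "%d.%d" % (asplain >> 16, asplain & 0xFFFF)


# Edge local AS as seen from the TOR switch (neighbor AS 65001.65535)
print(asdot_to_asplain("65001.65535"))  # 4259971071
# TOR local AS (65001.65008) as seen from the edge node
print(asdot_to_asplain("65001.65008"))  # 4259970544
print(asplain_to_asdot(4259971071))     # 65001.65535
```

Both values match what the edge node reported earlier: Local AS 4259971071 and neighbor AS 4259970544.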
sc2-01-r08lswa# show ip arp 10.184.248.2
Flags: * - Adjacencies learnt on non-active FHRP router
+ - Adjacencies synced via CFSoE
# - Adjacencies Throttled for Glean
CP - Added via L2RIB, Control plane Adjacencies
PS - Added via L2RIB, Peer Sync
RO - Re-Originated Peer Sync Entry
D - Static Adjacencies attached to down interface
IP ARP Table
Total number of entries: 1
Address Age MAC Address Interface Flags
10.184.248.2 00:06:12 0200.7051.9d79 Vlan1611
- If you compare this IP and MAC address, you can see that they match the T0 SR uplink interface of the edge02 node:
IP/Mask : 10.184.248.2/24
MAC : 02:00:70:51:9d:79
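The switch prints the MAC address as 0200.7051.9d79 while the edge node prints 02:00:70:51:9d:79, so a direct string comparison fails. A small sketch for normalizing both formats before comparing (the function name `normalize_mac` is hypothetical):

```python
import re


def normalize_mac(mac):
    """Return a MAC address as lowercase colon-separated pairs."""
    digits = re.sub(r"[^0-9a-fA-F]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError("not a 48-bit MAC address: %r" % mac)
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


tor_arp_mac = "0200.7051.9d79"         # from show ip arp on the TOR switch
edge_uplink_mac = "02:00:70:51:9d:79"  # from get interfaces on edge02

print(normalize_mac(tor_arp_mac) == normalize_mac(edge_uplink_mac))  # True
```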
For further troubleshooting, you can capture packets on the edge nodes and on the ESXi host, and analyze them using Wireshark.
Packet capture from Edge node
- Capture packets from the T0 SR uplink interface.
sc2-01-nsxt04-r08edge01(tier0_sr[5])> get interfaces | more
Wed Aug 17 2022 UTC 13:52:48.203
Logical Router
UUID VRF LR-ID Name Type
fb1ad846-8757-4fdf-9cbb-5c22ba772b52 5 2 SR-sc2-01-nsxt04-tr SERVICE_ROUTER_TIER0
Interfaces (IPv6 DAD Status A-DAD_Success, F-DAD_Duplicate, T-DAD_Tentative, U-DAD_Unavailable)
Interface : c8b80ba1-93fc-5c82-a44f-4f4863b6413c
Ifuid : 286
Mode : cpu
Port-type : cpu
Enable-mcast : false
Interface : 4915d978-9c9a-58bc-84e2-cafe5442cba4
Ifuid : 287
Mode : blackhole
Port-type : blackhole
Interface : 899bcf30-83e2-46bb-9be2-8889ec52b354
Ifuid : 833
Name : tr-interconnect-edge01
Fwd-mode : IPV4_AND_IPV6
Internal name : uplink-833
Mode : lif
Port-type : uplink
IP/Mask : 10.184.248.1/24
MAC : 02:00:70:d1:92:b1
VLAN : 1611
Access-VLAN : untagged
LS port : 15b971e9-7caa-43b7-86c1-96ff50453402
Urpf-mode : STRICT_MODE
DAD-mode : LOOSE
RA-mode : SLAAC_DNS_TRHOUGH_RA(M=0, O=0)
Admin : up
Op_state : up
Enable-mcast : False
MTU : 9000
arp_proxy :
- Start a continuous ping from the TOR switches to the edge uplink IP (in this case, ping 10.184.248.1) before starting the packet capture.
sc2-01-nsxt04-r08edge01> start capture interface 899bcf30-83e2-46bb-9be2-8889ec52b354 file uplink.pcap
Note:
The uplink.pcap file is written on the edge node, not on the TOR switches. Find its location on the edge node and SCP it to your local machine to analyze it using Wireshark.
Packet capture from ESXi
- In this example, we capture packets of the sc2-01-nsxt04-r08edge01 VM at the switchports its interfaces are connected to. The sc2-01-nsxt04-r08edge01 VM is running on ESXi host sc2-01-r08esx10.
[root@sc2-01-r08esx10:~] esxcli network vm list | grep edge
18790721 sc2-01-nsxt04-r08edge05 3 , ,
18977245 sc2-01-nsxt04-r08edge01 3 , ,
[root@sc2-01-r08esx10:/tmp] esxcli network vm port list -w 18977245
Port ID: 67109446
vSwitch: sc2-01-vc16-dvs
Portgroup:
DVPort ID: b60a80c0-ecd6-40bd-8d2b-fbd1f06bb172
MAC Address: 02:00:70:33:a9:67
IP Address: 0.0.0.0
Team Uplink: vmnic1
Uplink Port ID: 2214592517
Active Filters:
Port ID: 67109447
vSwitch: sc2-01-vc16-dvs
Portgroup:
DVPort ID: 6e3d8057-fc23-4180-b0ba-bed90381f0bf
MAC Address: 02:00:70:d1:92:b1
IP Address: 0.0.0.0
Team Uplink: vmnic1
Uplink Port ID: 2214592517
Active Filters:
Port ID: 67109448
vSwitch: sc2-01-vc16-dvs
Portgroup:
DVPort ID: c531df19-294d-4079-b39c-89a3b58e30ad
MAC Address: 02:00:70:30:c7:01
IP Address: 0.0.0.0
Team Uplink: vmnic0
Uplink Port ID: 2214592519
Active Filters:
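To work out which switchport carries the T0 SR uplink, match the uplink MAC from the edge node (02:00:70:d1:92:b1 for edge01) against the port list. The sketch below does this on saved `esxcli network vm port list` output. The function name `ports_by_mac` is hypothetical, and it assumes the "Key: value" layout shown above.

```python
def ports_by_mac(esxcli_output):
    """Map each MAC address to the Port ID of the block it appears in."""
    ports = {}
    port_id = None
    for line in esxcli_output.splitlines():
        # Split on the first colon only, so MAC addresses stay intact.
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Port ID":
            port_id = value
        elif key == "MAC Address" and port_id is not None:
            ports[value.lower()] = port_id
    return ports


SAMPLE_PORT_LIST = """\
Port ID: 67109446
vSwitch: sc2-01-vc16-dvs
MAC Address: 02:00:70:33:a9:67
Uplink Port ID: 2214592517
Port ID: 67109447
vSwitch: sc2-01-vc16-dvs
MAC Address: 02:00:70:d1:92:b1
Uplink Port ID: 2214592517
"""

print(ports_by_mac(SAMPLE_PORT_LIST)["02:00:70:d1:92:b1"])  # 67109447
```

Port ID 67109447 is therefore the switchport behind the edge01 uplink interface with IP 10.184.248.1.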
- Start a continuous ping from the TOR switches to the edge uplink IP (in this case, ping 10.184.248.1) before starting the packet capture.
[root@sc2-01-r08esx10:/tmp] pktcap-uw --switchport 67109446 --dir 2 -o /tmp/67109446-02:00:70:33:a9:67.pcap --count 1000 & pktcap-uw --switchport 67109447 --dir 2 -o /tmp/67109447-02:00:70:d1:92:b1.pcap --count 1000 & pktcap-uw --switchport 67109448 --dir 2 -o /tmp/67109448-02:00:70:30:c7:01.pcap --count 1000
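Typing that one-liner by hand is error-prone when a VM has several ports. As a convenience, the command string can be generated from a list of port ID and MAC pairs. This is a sketch only: `build_pktcap_command` is a hypothetical helper, it just builds the text of the command and runs nothing, and it reuses exactly the pktcap-uw options from the command above.

```python
def build_pktcap_command(ports, out_dir="/tmp", count=1000):
    """Build one pktcap-uw invocation per port, joined with ' & '."""
    parts = []
    for port_id, mac in ports:
        outfile = "%s/%s-%s.pcap" % (out_dir, port_id, mac)
        parts.append(
            "pktcap-uw --switchport %s --dir 2 -o %s --count %d"
            % (port_id, outfile, count)
        )
    return " & ".join(parts)


EDGE01_PORTS = [
    ("67109446", "02:00:70:33:a9:67"),
    ("67109447", "02:00:70:d1:92:b1"),
    ("67109448", "02:00:70:30:c7:01"),
]

print(build_pktcap_command(EDGE01_PORTS))
```

The printed string can be pasted into the ESXi shell; naming each file after its port ID and MAC makes it easy to tell the captures apart in Wireshark.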
Note:
SCP the pcap files to your laptop and use Wireshark to analyze them.
You can also capture packets on the physical uplinks (vmnic) of the ESXi host if required.
Hope it was useful. Cheers!