Monday 21 April 2014

Weird 2960 ARP issue

I've been doing a load of migrations lately which involve moving circuits from one head-end PE to another, and I keep running into a problem with 2960 switches running the LAN Base image. After each move, everything attached to the switch is still reachable, but I lose management connectivity to the on-site switch at the far end of the circuit. Strangely, it is always possible to ping / SSH to the switch from the connected interface, just not from the management station.

The issue seems to be down to some really weird behaviour with the ARP table of the 2960. LAN Base is a Layer 2 only image, so the box can only have one SVI active and relies on a default gateway to get to any other networks - nothing new there. The weird thing is that for some reason when the 2960 wants to send traffic to a remote network, for example responding to a ping from a management station, it creates an ARP entry in its table for the remote IP with the gateway's MAC. In the output below, 192.168.50.50 is a remote host that still maps to the old gateway's MAC, while 10.123.145.193 is the gateway itself and 10.123.145.194 is the switch's own SVI.

Remote-Sw-01#show ip arp
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  192.168.50.50         167   00c0.321a.be00  ARPA   Vlan120
Internet  10.123.145.193          0   189c.5dfe.be1f  ARPA   Vlan120
Internet  10.123.145.194          -   3037.ade1.a4b4  ARPA   Vlan120

This seems to happen irrespective of whether proxy ARP is enabled on the upstream interface. In any case the switch should not be ARPing for anything outside its own subnet, so I seriously doubt that the entry is being built by any genuine ARP transaction. Looks like a bodge to me :)
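To spot these entries quickly, the table can be checked against the SVI's subnet. The following is a rough Python sketch, not anything the switch does itself: it parses pasted `show ip arp` output and flags any address that falls outside the local network. The /30 prefix in `SVI_NETWORK` is purely an assumption for illustration (the real mask isn't shown above), and the function names are made up for this example.

```python
import ipaddress

def parse_arp_table(text):
    """Parse pasted `show ip arp` output into (ip, age, mac, interface) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows look like: Internet <ip> <age> <mac> ARPA <interface>
        if len(fields) == 6 and fields[0] == "Internet":
            _, ip, age, mac, _, interface = fields
            entries.append((ip, age, mac, interface))
    return entries

def off_subnet_entries(entries, svi_network):
    """Return the entries whose IP is not inside the SVI's own subnet."""
    net = ipaddress.ip_network(svi_network)
    return [e for e in entries if ipaddress.ip_address(e[0]) not in net]

ARP_OUTPUT = """\
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  192.168.50.50         167   00c0.321a.be00  ARPA   Vlan120
Internet  10.123.145.193          0   189c.5dfe.be1f  ARPA   Vlan120
Internet  10.123.145.194          -   3037.ade1.a4b4  ARPA   Vlan120
"""

# Assumption: the management SVI sits in a /30; adjust to the real mask.
SVI_NETWORK = "10.123.145.192/30"

suspects = off_subnet_entries(parse_arp_table(ARP_OUTPUT), SVI_NETWORK)
for ip, age, mac, interface in suspects:
    print(f"{ip} is off-subnet but resolves to {mac} on {interface}")
```

With the table above, only the 192.168.50.50 entry is reported, since the gateway and the switch's own address both sit inside the assumed subnet.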

Once these spurious non-adjacent ARP entries are in place they do not seem to get overwritten by, for example, receiving traffic from a given IP with a different MAC. Fortunately, legitimate entries for the local subnet do get overwritten, which leaves the door slightly ajar.

I can't see any way to stop the annoying behaviour, so the obvious workaround is to SSH in from the connected interface (check your ACLs!) and blow any entries still referring to the old gateway MAC out of the ARP table.

Remote-Sw-01#clear ip arp 192.168.50.50
Remote-Sw-01#show ip arp
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  192.168.50.50           0   189c.5dfe.be1f  ARPA   Vlan120
Internet  10.123.145.193          1   189c.5dfe.be1f  ARPA   Vlan120
Internet  10.123.145.194          -   3037.ade1.a4b4  ARPA   Vlan120

At that point the correct gateway MAC will be learned and connectivity should be restored straight away. Another option is to SSH from a second management station which hasn't connected recently enough to have an ARP entry. Of course, if you have 4 hours to spare (the default ARP timeout) you could just wait for the entry to expire.
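If a move leaves more than a couple of stale entries behind, the clear commands can be generated rather than typed one by one. This is a minimal Python sketch under my own assumptions: it takes pasted `show ip arp` output plus the old gateway MAC and prints one `clear ip arp` line per matching entry. The function name `stale_clear_commands` is hypothetical, and the script only builds text to paste into the session; it doesn't talk to the switch.

```python
def stale_clear_commands(arp_output, old_gateway_mac):
    """Build a `clear ip arp <ip>` line for each entry using the old gateway MAC."""
    commands = []
    wanted = old_gateway_mac.lower()
    for line in arp_output.splitlines():
        fields = line.split()
        # Data rows look like: Internet <ip> <age> <mac> ARPA <interface>
        if len(fields) == 6 and fields[0] == "Internet":
            ip, mac = fields[1], fields[3]
            if mac.lower() == wanted:
                commands.append(f"clear ip arp {ip}")
    return commands

# Table as captured before the fix, with the old gateway MAC still present.
BEFORE = """\
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  192.168.50.50         167   00c0.321a.be00  ARPA   Vlan120
Internet  10.123.145.193          0   189c.5dfe.be1f  ARPA   Vlan120
Internet  10.123.145.194          -   3037.ade1.a4b4  ARPA   Vlan120
"""

commands = stale_clear_commands(BEFORE, "00c0.321a.be00")
for command in commands:
    print(command)
```

Run against the table from before the fix, this produces the same single `clear ip arp 192.168.50.50` used above.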
