Saturday, 4 July 2015

OSPF stuck in EXCHANGE / EXSTART

One problem that occasionally comes up in network troubleshooting, mainly in carrier type environments, is a situation where OSPF refuses to come up to a FULL state and instead just sits in the EXCHANGE state at one end and the EXSTART state at the other. To be fair it's one of those things you've either seen or you haven't, but it's something every network engineer should know.

TL;DR - If you don't care why and just want to fix it: it's ALWAYS an MTU mismatch!

For those who are interested, I'll explain what's happening after a quick review of the OSPF neighbour establishment process. Here's a prettified version of the state table from RFC 2328:



Up until the EXSTART state, all the packets are small and no MTU information is shared so everything works fine. Now, to move from the from the EXSTART state into the EXCHANGE state the two devices must agree on who is master. This is done by each device sending an empty database descriptor (DBD) packet to the other - the devices check each other's DBDs and the device with the highest router ID becomes master.

The problem here is that DBDs contain MTU information and if a DBD is received with a higher MTU than the interface on which it arrived, the DBD is silently dropped as per RFC 2328:

"If the Interface MTU field in the Database Description packet indicates an IP datagram size that is larger than the router can accept on the receiving interface without fragmentation, the Database Description packet is rejected."

So, the device with the larger interface MTU receives a DBD, sends its DBD and is happy enough to move into the EXCHANGE state. The device with the smaller MTU has sent its DBD but has effectively not received one in return so it remains in EXSTART. No matter how many times the DBD with the larger MTU is retransmitted it will never be accepted. Eventually the state times out and we go back to the beginning.

Papering Over the Cracks


In Cisco IOS it is possible to configure ip ospf mtu-ignore under the interface, which drops the MTU check for that interface. This might seem like a good idea, however I wouldn't recommend it. Of course the best practice is to make sure MTUs are consistent across your network, there's not really an excuse to have MTU mismatched across a link! While ignoring the MTU might get the link up, you are storing up problems for later. Aside from the obvious data plane issues (black-holing large packets in one direction) you may also break the control plane.

For example, you could have a configuration that has been in place for months without change and has "always worked" but suddenly, following a link flap, is now stuck in EXCHANGE / EXSTART. Initially when you connect the devices up, the odds are that the LSDB will be small. At that point, the mismatched MTU will not cause problems and the neighbour will establish fine. Later on in life, though, the LSDBs will be full and the DBDs larger, until the device with the larger MTU has a big enough LSDB update to fill an over-sized packet which its partner can't handle. Then the state gets all screwed up and neighbours reset... bad times!

Debugging


If you're stuck in EXCHANGE / EXSTART but you're still not convinced it's MTU (or if you're trying to inter-op and your two devices use different conventions to define MTU) you can use debugs to confirm what's going on.

The key one here for Cisco IOS is "debug ip ospf adj", which produces output as shown below:

*Jul 4 22:04:23.979: OSPF-1 ADJ Fa0/0: 2 Way Communication to 10.4.4.254, state 2WAY
*Jul 4 22:04:23.983: OSPF-1 ADJ Fa0/0: Nbr 10.4.4.254: Prepare dbase exchange
*Jul 4 22:04:23.983: OSPF-1 ADJ Fa0/0: Send DBD to 10.4.4.254 seq 0x1D43 opt 0x52 flag 0x7 len 32
*Jul 4 22:04:24.011: OSPF-1 ADJ Fa0/0: Rcv DBD from 10.4.4.254 seq 0xA0CF5CC opt 0x52 flag 0x7 len 32 mtu 1500 state EXSTART
*Jul 4 22:04:24.011: OSPF-1 ADJ Fa0/0: Nbr 10.4.4.254 has larger interface MTU
*Jul 4 22:04:28.435: OSPF-1 ADJ Fa0/0: Rcv DBD from 10.4.4.254 seq 0xA0CF5CC opt 0x52 flag 0x7 len 32 mtu 1500 state EXSTART
*Jul 4 22:04:28.439: OSPF-1 ADJ Fa0/0: Nbr 10.4.4.254 has larger interface MTU
*Jul 4 22:04:28.623: OSPF-1 ADJ Fa0/0: Send DBD to 10.4.4.254 seq 0x1D43 opt 0x52 flag 0x7 len 32
*Jul 4 22:04:28.623: OSPF-1 ADJ Fa0/0: Retransmitting DBD to 10.4.4.254 [1]
[...]
*Jul 4 22:06:27.955: OSPF-1 ADJ Fa0/0: Rcv DBD from 10.4.4.254 seq 0xA0CF5CC opt 0x52 flag 0x7 len 32 mtu 1500 state EXSTART
*Jul 4 22:06:27.955: OSPF-1 ADJ Fa0/0: Nbr 10.4.4.254 has larger interface MTU
*Jul 4 22:06:28.147: OSPF-1 ADJ Fa0/0: Killing nbr 10.4.4.254 due to excessive (25) retransmissions
*Jul 4 22:06:28.147: OSPF-1 ADJ Fa0/0: 10.4.4.254 address 10.4.4.254 is dead, state DOWN
*Jul 4 22:06:28.151: %OSPF-5-ADJCHG: Process 1, Nbr 10.4.4.254 on FastEthernet0/0 from EXSTART to DOWN, Neighbor Down: Too many retransmissions
*Jul 4 22:06:28.151: OSPF-1 ADJ Fa0/0: Nbr 10.4.4.254: Clean-up dbase exchange
*Jul 4 22:06:32.555: OSPF-1 ADJ Fa0/0: Nbr 10.4.4.254 10.4.4.254 is currently ignored


On IOS-XR we have "debug ospf instance-id adj", which returns the same output.

On Juniper JunOS we can configure "set protocols ospf traceoptions flag database-description" which produces the output below:


Jul 4 22:04:41.813313 OSPF rcvd DbD 10.4.4.99 -> 10.4.4.254 (vlan.0 IFL 69 area 0.0.0.0)
Jul 4 22:04:41.814387 Version 2, length 32, ID 10.4.4.99, area 0.0.0.0
Jul 4 22:04:41.814466 checksum 0x0, authtype 0
Jul 4 22:04:41.814566 options 0x52, i 1, m 1, ms 1, r 0, seq 0x1d43, mtu 1492
Jul 4 22:04:41.815182 RPD_OSPF_NBRUP: OSPF neighbor 10.4.4.99 (realm ospf-v2 vlan.0 area 0.0.0.0) state changed from Init to ExStart due to 2WayRcvd (event reason: initial DBD packet was received)
Jul 4 22:04:41.815388 1400 Max dbd packet
Jul 4 22:04:41.815763 OSPF sent DbD 10.4.4.254 -> 224.0.0.5 (vlan.0 IFL 69 area 0.0.0.0)
Jul 4 22:04:41.815889 Version 2, length 32, ID 10.4.4.254, area 0.0.0.0
Jul 4 22:04:41.815970 options 0x52, i 1, m 1, ms 1, r 0, seq 0xa0cf5cc, mtu 1500
Jul 4 22:04:46.254104 OSPF resend last DBD to 10.4.4.99
Jul 4 22:04:46.254753 OSPF sent DbD 10.4.4.254 -> 224.0.0.5 (vlan.0 IFL 69 area 0.0.0.0)
Jul 4 22:04:46.254861 Version 2, length 32, ID 10.4.4.254, area 0.0.0.0
Jul 4 22:04:46.254966 options 0x52, i 1, m 1, ms 1, r 0, seq 0xa0cf5cc, mtu 1500
Jul 4 22:04:46.447212 OSPF rcvd DbD 10.4.4.99 -> 10.4.4.254 (vlan.0 IFL 69 area 0.0.0.0)
Jul 4 22:04:46.447359 Version 2, length 32, ID 10.4.4.99, area 0.0.0.0
Jul 4 22:04:46.447439 checksum 0x0, authtype 0
Jul 4 22:04:46.447584 options 0x52, i 1, m 1, ms 1, r 0, seq 0x1d43, mtu 1492
Jul 4 22:04:50.313983 OSPF resend last DBD to 10.4.4.99
[...]
Jul 4 22:06:45.737775 OSPF sent DbD 10.4.4.254 -> 224.0.0.5 (vlan.0 IFL 69 area 0.0.0.0)
Jul 4 22:06:45.737882 Version 2, length 32, ID 10.4.4.254, area 0.0.0.0
Jul 4 22:06:45.738103 options 0x52, i 1, m 1, ms 1, r 0, seq 0xa0cf5cc, mtu 1500
Jul 4 22:06:50.336478 OSPF resend last DBD to 10.4.4.99
Jul 4 22:06:50.337124 OSPF sent DbD 10.4.4.254 -> 224.0.0.5 (vlan.0 IFL 69 area 0.0.0.0)
Jul 4 22:06:50.337291 Version 2, length 32, ID 10.4.4.254, area 0.0.0.0
Jul 4 22:06:50.337414 options 0x52, i 1, m 1, ms 1, r 0, seq 0xa0cf5cc, mtu 1500
Jul 4 22:06:54.868260 RPD_OSPF_NBRDOWN: OSPF neighbor 10.4.4.99 (realm ospf-v2 vlan.0 area 0.0.0.0) state changed from ExStart to Init due to 1WayRcvd (event reason: neighbor is in one-way mode)

References

RFC2328


No comments:

Post a Comment