I recently ran into a problem while applying a base build to a Cisco 7600 router with dual supervisors. It didn't seem to be documented anywhere, so I thought I'd record the issue and the eventual fix here.
The gist of the problem was that the secondary supervisor would not go from cold standby to hot. In other words, if the active supervisor crashed, the chassis would have to reboot in order to use the standby supervisor. The system gave the reason as a software mismatch, even though the two cards had the same image installed:
BUILD#show redundancy states
my state = 13 -ACTIVE
peer state = 4 -STANDBY COLD
Mode = Duplex
Unit = Primary
Unit ID = 5
Redundancy Mode (Operational) = rpr Reason: Software mismatch
Redundancy Mode (Configured) = sso
Redundancy State = rpr
Maintenance Mode = Disabled
Communications = Up
client count = 159
client_notification_TMR = 30000 milliseconds
keep_alive TMR = 9000 milliseconds
keep_alive count = 1
keep_alive threshold = 18
RF debug mask = 0x0
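If you want to spot this condition from a script rather than by eye, the output above is easy enough to pick apart. The following is only a rough sketch: it assumes you have already captured the text of "show redundancy states" somehow, and the function names are made up for illustration rather than taken from any Cisco tooling.

```python
# Sketch only: the names below are hypothetical, not part of any Cisco library.

# Trimmed copy of the output shown above.
SAMPLE = """\
my state = 13 -ACTIVE
peer state = 4 -STANDBY COLD
Mode = Duplex
Redundancy Mode (Operational) = rpr Reason: Software mismatch
Redundancy Mode (Configured) = sso
Redundancy State = rpr
"""


def parse_states(text):
    """Split each 'key = value' line of the output into a dict entry."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without an '='
            fields[key.strip()] = value.strip()
    return fields


def redundancy_mismatch(text):
    """Return (configured, operational, reason) if the modes differ, else None."""
    fields = parse_states(text)
    configured = fields.get("Redundancy Mode (Configured)", "")
    # The operational line may carry a trailing 'Reason: ...' explanation.
    operational, _, reason = fields.get(
        "Redundancy Mode (Operational)", "").partition("Reason:")
    operational = operational.strip()
    if configured and operational and configured != operational:
        return configured, operational, reason.strip()
    return None


print(redundancy_mismatch(SAMPLE))
```

A result of None means the operational mode matches the configured one; anything else tells you what the box fell back to and why.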
I won't say exactly which image this was, but it was an SSO-capable release of IOS 15 and the two supervisors were *definitely* running the same code (one was copied from the other). The tale of software incompatibility seemed unlikely.
BUILD#show log
[snip]
*Jan 6 17:21:33.339: %SYS-SP-STDBY-5-RESTART: System restarted --
Cisco IOS Software, c7600s72033_sp Software (c7600s72033_sp-ADVIPSERVICESK9-M), Version 15.x(x)x, RELEASE SOFTWARE (xx)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2012 by Cisco Systems, Inc.
Compiled Mon 00-Jan-00 00:00 by prod_rel_team
*Jan 6 17:22:50.255 GMT: Config Sync: Bulk-sync failure due to Servicing Incompatibility. Please check full list of mismatched commands via:
show redundancy config-sync failures mcl
*Jan 6 17:22:50.255 GMT: Config Sync: Starting lines from MCL file:
-ipv6 mfib hardware-switching replication-mode ingress
*Jan 6 17:22:50.255 GMT: %ISSU-SP-3-INCOMPATIBLE_PEER_UID: Setting image (c7600s72033_sp-ADVIPSERVICESK9-M), version (15.x(x)xx) on peer uid (6) as incompatible
*Jan 6 17:22:50.995 GMT: %RF-SP-5-RF_RELOAD: Peer reload. Reason: ISSU Incompatibility
*Jan 6 17:22:50.995 GMT: %OIR-SP-3-PWRCYCLE: Card in module 6, is being power-cycled (RF request)
*Jan 6 17:22:51.999 GMT: %PFREDUN-SP-6-ACTIVE: Standby processor removed or reloaded, changing to Simplex mode
*Jan 6 17:22:53.195 GMT: %SNMP-5-MODULETRAP: Module 6 [Down] Trap
*Jan 6 17:24:19.791 GMT: %ISSU-SP-3-PEER_IMAGE_INCOMPATIBLE: Peer image (c7600s72033_sp-ADVIPSERVICESK9-M), version (15.x(x)xx) on peer uid (6) is incompatible
*Jan 6 17:24:19.791 GMT: %ISSU-SP-3-PEER_IMAGE_INCOMPATIBLE: Peer image (c7600s72033_sp-ADVIPSERVICESK9-M), version (15.x(x)xx) on peer uid (6) is incompatible
*Jan 6 17:25:53.149 GMT: %PFREDUN-SP-4-INCOMPATIBLE: Defaulting to RPR mode (Runtime incompatible)
*Jan 6 17:25:54.154 GMT: %PFREDUN-SP-6-ACTIVE: Standby initializing for RPR mode
*Jan 6 17:25:58.471 GMT: %SYS-SP-3-LOGGER_FLUSHED: System was paused for 00:00:00 to ensure console debugging output.
*Jan 6 17:25:58.763 GMT: %FABRIC-SP-5-CLEAR_BLOCK: Clear block option is off for the fabric in slot 6.
*Jan 6 17:25:58.859 GMT: %FABRIC-SP-5-FABRIC_MODULE_BACKUP: The Switch Fabric Module in slot 6 became standby
*Jan 6 17:26:00.299 GMT: %SNMP-5-MODULETRAP: Module 6 [Up] Trap
*Jan 6 17:26:00.279 GMT: %DIAG-SP-6-BYPASS: Module 6: Diagnostics is bypassed
*Jan 6 17:26:00.375 GMT: %OIR-SP-6-INSCARD: Card inserted in slot 6, interfaces are now online
*Jan 6 17:26:06.435 GMT: %RF-SP-5-RF_TERMINAL_STATE: Terminal state reached for (RPR)
OK, so clearly it doesn't like the "ipv6 mfib hardware-switching replication-mode ingress" command for some reason. Why it would work on one and not the other is a mystery but hey... I don't have big plans for IPv6 multicast so I don't care what replication mode it's in - let's just delete the offending command:
BUILD#conf t
Enter configuration commands, one per line. End with CNTL/Z.
BUILD(config)#no ipv6 mfib hardware-switching replication-mode ingress
no ipv6 mfib hardware-switching replication-mode ingress
^
% Invalid input detected at '^' marker.
So I can't negate the command. There isn't even an "mfib" option under "no ipv6":
BUILD(config)#no ipv6 ?
access-list Configure access lists
[snip]
local Specify local options
mld Global mld commands
[snip]
spd Selective Packet Discard (SPD)
In fact, even the original command seems to be invalid:
BUILD(config)#ipv6 mfib hardware-switching replication-mode ?
% Unrecognized command
And yet here it is in the config from which we booted:
BUILD#show start | inc ipv6
ipv6 unicast-routing
ipv6 mfib hardware-switching replication-mode ingress
no mls flow ipv6
?!?!
I guess it's one of those legacy commands that the CLI is bodged to accept but that doesn't show up in the help. Except it won't accept the command anyway :| Eventually I found an equivalent command that it *would* take:
BUILD(config)#no ipv6 multicast hardware-switching replication-mode ingress
Warning: This command will change the replication mode for all address families.
BUILD(config)#do show run | inc ipv6
ipv6 unicast-routing
no mls flow ipv6
BUILD(config)#
At last, the problem config is gone! We're almost there, but not quite: the previous failures stay recorded on the active supervisor even if the standby is reloaded, so we have to kick it to re-evaluate:
BUILD#show redundancy config-sync failures mcl
Mismatched Command List
-----------------------
-ipv6 mfib hardware-switching replication-mode ingress
BUILD#redundancy config-sync validate mismatched-commands
*Jan 7 08:26:28.600 GMT: CONFIG SYNC: MCL validation succeeded
*Jan 7 08:26:28.600 GMT: %ISSU-SP-3-PEER_IMAGE_REM_FROM_INCOMP_LIST: Peer image (c7600s72033_sp-ADVIPSERVICESK9-M), version (15.x(x)xx) on peer uid (6) being removed from the incompatibility list
BUILD#show redundancy config-sync failures mcl
Mismatched Command List
-----------------------
The list is Empty
BUILD#redundancy reload peer
Reload peer [confirm]
Preparing to reload peer
BUILD#
*Jan 7 08:27:16.096 GMT: RP sending reload request to Standby. User: admin on console, Reason: Admin reload CLI
BUILD#
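The "is the MCL empty yet?" check above is also simple to script. This is a minimal sketch, assuming the output of "show redundancy config-sync failures mcl" always looks like the two captures in this post, with each offending command prefixed by "-". The function name is hypothetical.

```python
# Sketch only: mismatched_commands() is a made-up helper name.

BEFORE = """\
Mismatched Command List
-----------------------
-ipv6 mfib hardware-switching replication-mode ingress
"""

AFTER = """\
Mismatched Command List
-----------------------
The list is Empty
"""


def mismatched_commands(text):
    """Return the commands listed in the MCL output, without the '-' prefix."""
    commands = []
    for line in text.splitlines():
        stripped = line.strip()
        # Entries start with '-'; the header underline is nothing but dashes,
        # so skip any line that is empty once the dashes are stripped off.
        if stripped.startswith("-") and stripped.strip("-"):
            commands.append(stripped[1:].strip())
    return commands


print(mismatched_commands(BEFORE))
print(mismatched_commands(AFTER))
```

An empty list is the signal that it's worth reloading the peer; anything else is the set of commands still to be cleaned out of the config.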
Eventually...
*Jan 7 08:33:37.532 GMT: %HA_CONFIG_SYNC-6-BULK_CFGSYNC_SUCCEED: Bulk Sync succeeded
*Jan 7 08:33:37.552 GMT: %RF-SP-5-RF_TERMINAL_STATE: Terminal state reached for (SSO)
*Jan 7 08:33:36.572 GMT: %PFREDUN-SP-STDBY-6-STANDBY: Ready for SSO mode
BUILD#show redundancy
Redundant System Information :
------------------------------
Available system uptime = 15 hours, 20 minutes
Switchovers system experienced = 0
Standby failures = 3
Last switchover reason = none
Hardware Mode = Duplex
Configured Redundancy Mode = sso
Operating Redundancy Mode = sso
Maintenance Mode = Disabled
Communications = Up
Current Processor Information :
-------------------------------
Active Location = slot 5
Current Software state = ACTIVE
Uptime in current state = 15 hours, 19 minutes
Image Version = Cisco IOS Software, c7600s72033_rp Software (c7600s72033_rp-ADVIPSERVICESK9-M), Version 15.x(x)xx, RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2012 by Cisco Systems, Inc.
Compiled Wed 01-Aug-12 20:15 by prod_rel_team
BOOT = sup-bootdisk:/c7600s72033-advipservicesk9-mz.15x-x.xx.bin,1;
CONFIG_FILE =
BOOTLDR =
Configuration register = 0x2102
Peer Processor Information :
----------------------------
Standby Location = slot 6
Current Software state = STANDBY HOT
Uptime in current state = 3 minutes
Image Version = Cisco IOS Software, c7600s72033_rp Software (c7600s72033_rp-ADVIPSERVICESK9-M), Version 15.x(x)xx, RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2012 by Cisco Systems, Inc.
Compiled Wed 01-Aug-12 20:15 by prod_rel_team
BOOT = sup-bootdisk:/c7600s72033-advipservicesk9-mz.15x-x.xx.bin,1;
CONFIG_FILE =
BOOTLDR =
Configuration register = 0x2102
BUILD#
Win!
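For a build checklist, the final state can be confirmed the same way. Again this is just a sketch under assumptions: it expects the text of "show redundancy" in the layout shown above, where the peer's state sits under the "Peer Processor Information" heading, and all the names are invented for the example.

```python
# Sketch only: hypothetical helper names, sample trimmed from the output above.

SAMPLE = """\
Hardware Mode = Duplex
Configured Redundancy Mode = sso
Operating Redundancy Mode = sso
Current Processor Information :
-------------------------------
Active Location = slot 5
Current Software state = ACTIVE
Peer Processor Information :
----------------------------
Standby Location = slot 6
Current Software state = STANDBY HOT
"""


def parse_show_redundancy(text):
    """Group 'key = value' lines by the section heading they appear under."""
    result = {"system": {}, "current": {}, "peer": {}}
    section = "system"
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Current Processor Information"):
            section = "current"
            continue
        if stripped.startswith("Peer Processor Information"):
            section = "peer"
            continue
        key, sep, value = stripped.partition("=")
        if sep:  # headings and underlines have no '=' and are skipped
            result[section][key.strip()] = value.strip()
    return result


def standby_ready_for_sso(text):
    """True only if the box is operating in SSO and the peer is STANDBY HOT."""
    parsed = parse_show_redundancy(text)
    mode = parsed["system"].get("Operating Redundancy Mode")
    peer_state = parsed["peer"].get("Current Software state")
    return mode == "sso" and peer_state == "STANDBY HOT"


print(standby_ready_for_sso(SAMPLE))
```

Keeping the sections separate matters because "Current Software state" appears twice, once for each supervisor.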