Troubleshooting when a switch crashes and reboots

Cause

Although the switch software is highly reliable, a switch in the stack can experience a software issue that causes that switch to crash and reboot. The crash can occur in the software running on the management CPU or in the software running on the interface CPUs. In either case, crash information is generated and the switch reboots.

The resiliency of the stack is determined by the stacking topology. In all cases, however, the interfaces/ports on the crashing switch are brought down and that switch reboots.
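To illustrate why topology matters, the following minimal Python sketch checks whether the surviving switches can still reach one another after a single switch crashes. It is not switch software. The helper names build_links and survivors_connected, the member numbering, and the "chain" topology used for contrast are assumptions made for this example only.

```python
from collections import deque


def build_links(n, topology):
    """Return adjacency sets for hypothetical stack members 1..n."""
    adj = {m: set() for m in range(1, n + 1)}

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)

    if topology == "mesh":
        # Every member is linked to every other member.
        for a in range(1, n + 1):
            for b in range(a + 1, n + 1):
                link(a, b)
    elif topology in ("chain", "ring"):
        # Neighbors are linked in sequence; a ring also closes the loop.
        for a in range(1, n):
            link(a, a + 1)
        if topology == "ring" and n > 2:
            link(n, 1)
    else:
        raise ValueError(f"unknown topology: {topology}")
    return adj


def survivors_connected(adj, crashed):
    """True if all members except the crashed one can still reach each other."""
    alive = set(adj) - {crashed}
    if not alive:
        return True
    start = next(iter(alive))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in adj[current]:
            if neighbor in alive and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen == alive


if __name__ == "__main__":
    for topology in ("chain", "ring", "mesh"):
        links = build_links(4, topology)
        print(topology, survivors_connected(links, crashed=2))
```

In this model, a ring or mesh of four members stays connected when member 2 crashes, while a chain splits into two fragments until the crashed switch rejoins.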

The following table describes how the stack reacts to the crashing switch, depending on the role that switch had when the crash occurred. The table assumes a resilient topology (that is, a mesh or ring).

Stacking role

Description

Commander

  • The standby takes over as the new Commander

  • A new standby is elected

  • Crashing switch writes core file to local stable storage

  • Crashing switch reboots and joins the stack

  • Core file and crash information for this switch are available from the Commander

Standby

  • A new standby is elected

  • Crashing switch writes core file to local stable storage

  • Crashing switch reboots and joins the stack

  • Core file and crash information for this switch are available from the Commander

Member

  • Crashing switch writes core file to local stable storage

  • Crashing switch reboots and joins the stack

  • Core file and crash information for this switch are available from the Commander
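The table above can be summarized as a small lookup, shown here as a hedged Python sketch. The function name crash_reaction and the step strings are illustrative assumptions taken from the table wording. They do not correspond to any switch API or command.

```python
# Steps that apply to every role, in the order the table lists them.
COMMON_STEPS = [
    "Crashing switch writes core file to local stable storage",
    "Crashing switch reboots and joins the stack",
    "Core file and crash information for this switch are available from the Commander",
]


def crash_reaction(role):
    """Return the ordered stack reaction for a crashing switch with the given role."""
    role = role.strip().lower()
    if role not in ("commander", "standby", "member"):
        raise ValueError(f"unknown stacking role: {role}")

    steps = []
    if role == "commander":
        # Only a Commander crash triggers a takeover by the standby.
        steps.append("The standby takes over as the new Commander")
    if role in ("commander", "standby"):
        # Either way the standby position becomes vacant and must be refilled.
        steps.append("A new standby is elected")
    return steps + COMMON_STEPS


if __name__ == "__main__":
    for role in ("Commander", "Standby", "Member"):
        print(role)
        for step in crash_reaction(role):
            print("  -", step)
```

The sketch makes the pattern in the table explicit: the last three steps are identical for every role, and only the failover and election steps depend on the role of the crashing switch.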

After a switch crashes, you can collect data to help understand why the crash occurred. This information consists of crash data, the crash log, and core-dump files. The show tech command displays logs of events that happened right before the crash.