Troubleshooting stacking

Troubleshooting OOBM and split stack issues

If all the OOBM ports in the stack are in the same VLAN, you can use the show oobm commands to view the current state of all the switches. For example, if you have a five-member chain and member 4 fails or has the power removed, a stack split will occur with an active fragment on members 1-2-3 and an inactive fragment on member 5.

There is one IP address for the active fragment. This can be statically set by assigning an IP address to the global OOBM port.

If the stack splits, you can connect to the active fragment using the global OOBM IP address and then enter the show oobm discovery command to see whether this active fragment has discovered any other members that are connected using the OOBM LAN.

In the following five-member chain example, connect using the global IP address of 10.0.11.49. Once logged on, enter the show stacking command.

Viewing stacking member status

HP Stack 3800#: show stacking

Stack  ID          : 00011cc1-de4d87c0
MAC  Address       : 1cc1de-4d87e5
Stack  Topology    : Chain
Stack  Status      : Fragment
Active Uptime      : 0d 0h 5m
Software  Version  : KA.15.03.0000x

Mbr
ID  Mac Address    Model                                 Pri  Status
--- ------------- -------------------------------------- --- ---------------
1   1cc1de-4d87c0  HP J9573A 3800-24G-PoE+-2SFP+ Switch  128  Commander
2   1cc1de-4dc740  HP J9573A 3800-24G-PoE+-2SFP+ Switch  128  Member
3   1cc1de-4dbd40  HP J9575A 3800-24G-2SFP+ Switch       128  Standby
4   1cc1de-4d79c0  HP J9576A 3800-48G-4SFP+ Switch       128  Missing
5   1cc1de-4da900  HP J9576A 3800-48G-4SFP+ Switch       200  Missing

Enter show oobm discovery to see if the members have been discovered using OOBM.

Viewing oobm discovery

HP Stack 3800#:  show oobm discovery

Active Stack Fragment(discovered) IP Address : 10.0.11.49

Mbr          
ID   Mac Address   Status
--- -------------- ----------
1   1cc1de-4d87c0  Commander
2   1cc1de-4dc740  Member
3   1cc1de-4dbd40  Member

Inactive Stack Fragment(discovered) IP Address : 10.0.10.98

Mbr      
ID  Mac Address    Status
--- -------------- ----------
5   1cc1de-4da900  Commander

Member 5 is up, but is an inactive fragment. It has an addressable IP address, which can be used to connect to this fragment.
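When several stacks must be checked, the discovery output can be reduced to a list of fragments and their IP addresses. The following Python sketch is an illustration only: it assumes the text layout shown in the example above, and the function name parse_oobm_discovery is hypothetical, not part of any HP tool.

```python
import re

SAMPLE = """\
Active Stack Fragment(discovered) IP Address : 10.0.11.49

Mbr
ID   Mac Address   Status
--- -------------- ----------
1   1cc1de-4d87c0  Commander
2   1cc1de-4dc740  Member
3   1cc1de-4dbd40  Member

Inactive Stack Fragment(discovered) IP Address : 10.0.10.98

Mbr
ID  Mac Address    Status
--- -------------- ----------
5   1cc1de-4da900  Commander
"""

HEADER = re.compile(
    r"^(Active|Inactive) Stack Fragment\(discovered\) IP Address\s*:\s*(\S+)"
)
MEMBER = re.compile(r"^(\d+)\s+([0-9a-f]{6}-[0-9a-f]{6})\s+(\w+)")


def parse_oobm_discovery(text):
    """Return one dict per fragment with its state, IP address and members."""
    fragments = []
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            fragments.append(
                {"state": header.group(1), "ip": header.group(2), "members": []}
            )
            continue
        member = MEMBER.match(line)
        if member and fragments:
            fragments[-1]["members"].append(
                {
                    "id": int(member.group(1)),
                    "mac": member.group(2),
                    "status": member.group(3),
                }
            )
    return fragments


for fragment in parse_oobm_discovery(SAMPLE):
    print(fragment["state"], fragment["ip"], [m["id"] for m in fragment["members"]])
```

The inactive fragment's address (10.0.10.98 in this example) is the one to connect to next.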

Connecting to a stack fragment

HP Stack 3800#: telnet 10.0.10.98 oobm
HP J9576A 3800-48G-4SFP+ Switch
Software revision KA.15.03.0000x
Copyright (C) 1991-2011 Hewlett-Packard Development Company, L.P.
RESTRICTED RIGHTS LEGEND
Confidential computer software. Valid license from HP required for possession,
use or copying. Consistent with FAR 12.211 and 12.212, Commercial
Computer Software, Computer Software Documentation, and Technical
Data for Commercial Items are licensed to the U.S. Government under
vendor's standard commercial license.
HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P.
20555 State Highway 249, Houston, TX 77070

Enter the show stacking command.

Viewing missing stack members

HP Stack 3800#: show stacking

Stack  Topology      : Chain
Stack  Status        : Fragment Inactive
Uptime               : 0d 0h 7m
Software  Version    : KA.15.03.0000x

Mbr
ID  Mac Address   Model                                  Pri  Status
--- ------------- -------------------------------------- --- ---------------
1   1cc1de-4d87c0 HP J9573A 3800-24G-PoE+-2SFP+ Switch   128  Missing
2   1cc1de-4dc740 HP J9573A 3800-24G-PoE+-2SFP+ Switch   128  Missing
3   1cc1de-4dbd40 HP J9575A 3800-24G-2SFP+ Switch        128  Missing
4   1cc1de-4d79c0 HP J9576A 3800-48G-4SFP+ Switch        128  Missing
5   1cc1de-4da900 HP J9576A 3800-48G-4SFP+ Switch        200  Commander

Confirm by entering the show oobm discovery command. Member 4 does not appear in either fragment, so it is down.

Confirming stack member 4 is down

HP Stack 3800#: show oobm discovery

Inactive Stack Fragment(discovered) IP Address: 10.0.10.98

Mbr  
ID   Mac Address   Status
--- -------------- ----------
5   1cc1de-4da900 Commander

Active Stack Fragment(discovered) IP Address: 10.0.11.49

Mbr  
ID   Mac Address   Status
--- -------------- ----------
1   1cc1de-4d87c0  Commander
2   1cc1de-4dc740  Member
3   1cc1de-4dbd40  Member
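The reasoning in this example is a set difference: a member that is configured in the stack but is not discovered in any fragment over OOBM is down. A minimal Python sketch of that check follows; the function name and the input shapes are assumptions made for illustration.

```python
def members_down(configured_ids, discovered_by_fragment):
    """Return configured member IDs that no fragment reports over OOBM."""
    discovered = set()
    for ids in discovered_by_fragment.values():
        discovered.update(ids)
    return sorted(set(configured_ids) - discovered)


# Member IDs from show stacking, and the members each fragment discovered.
configured = [1, 2, 3, 4, 5]
discovered = {"10.0.11.49": [1, 2, 3], "10.0.10.98": [5]}

print(members_down(configured, discovered))  # prints [4]
```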

Using fault recovery/troubleshooting tools

Stacking provides tools and logging information to aid in troubleshooting problems specific to stacking. Problems may include:

  • Installation/deployment issues

  • Problems with initial stack creation

    • Problems with adding or removing members

    • Booting an existing stack

  • Stacking failures encountered while running an existing stack

The tools used in troubleshooting problems are:

  • Event Log

  • Show commands

    • Show stacking

    • Show system

    • Show boot history

  • Show tech

  • LEDs

Troubleshooting installation and deployment issues

Installation and deployment issues include the initial deployment or creation of a stack, and adding members to or removing members from a stack.

Problem:

When using the Deterministic method, one or more of the statically provisioned members did not join the stack.

Possible reasons a switch does not join a stack are:

  • The switch being added is already a member of another stack and has a different stack ID.

  • The maximum number of switches is already configured.

  • The switch being added has been statically provisioned. The MAC address matches, but the switch type does not.

  • There is a problem with the stack cable.

  • The stack cables are connected in a way that creates an unsupported topology.

  • Stack module failure.

Solution:

Perform a diagnostic fingerprint.

Troubleshooting issues with adding or removing members in the stack

Problem:

Cannot add a new switch to an existing stack.

Solution:

Identify the root cause. Possible reasons for a member not joining an existing stack are:

  • The switch being added has already been a member of another stack and has a different stack ID.

  • The maximum number of switches is already configured.

  • The switch being added has been statically provisioned, but switch type and MAC address in the configuration do not match the switch being added.

  • There is a problem with the stack cable.

  • There is a problem with the physical stack cabling (illegal topology).

Problem:

The entire stack does not come up after a boot.

Solution:

There are several reasons why all members do not join the stack:

  • There is a problem with the stack cable.

  • Physical cabling was changed.

  • Stack booted on incorrect configuration.

  • One or more of the switches has a hardware problem (for example, bad power supply, bad stacking module, corrupt flash).

Problem:

One or more of the members keeps rebooting and does not join the stack.

Possible reasons:

  • An unresponsive member.

  • Heartbeat loss—a stack that has a member no longer in the stack or a member failing after joining the stack.

  • Illegal topology.

Problem:

After initial boot sequence, the activity and Link LEDs of an interface are not on and the ports are not passing traffic.

Solutions:

  • Identify the “inactive fragment” and provide alternatives for recovery.

  • Verify that all OOBMs are connected so that there is uninterrupted access.

Problem:

After a reboot, the selected Commander or Standby is not the expected switch.

Solutions:

Check to see if the log files provide a reason why the Commander and Standby were chosen and which rule they matched.

Troubleshooting a strictly provisioned, mismatched MAC address

When switches are strictly provisioned, it is possible to enter an incorrect type or incorrect MAC address. If this occurs, the switch does not match the intended configuration entry and stacking attempts to add this switch as a new “plug-and-go” switch. If the stacking configuration already has 10 switches, then the “plug-and-go” fails.

The following example shows a stack with 9 members. There is a new J9576A switch that is supposed to be member 4; however, the MAC address was mis-typed, therefore, there is an “opening” for a plug-and-go at member 10. It will join as member 10.
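The matching rules described here can be summarized as a small decision function. The following Python sketch is a simplified model, not the switch firmware: the names Entry and classify_join are hypothetical, and assigning the lowest unused member ID to a plug-and-go switch is an assumption that happens to agree with this example.

```python
from dataclasses import dataclass

MAX_MEMBERS = 10  # stack limit stated in this section


@dataclass
class Entry:
    member_id: int
    mac: str
    model: str  # J-number, for example "J9576A"


def classify_join(entries, mac, model):
    """Predict what happens when a switch with this MAC and model boots."""
    for entry in entries:
        if entry.mac == mac:
            if entry.model == model:
                return f"joins as member {entry.member_id}"
            return f"fails: member {entry.member_id} is provisioned as {entry.model}"
    if len(entries) >= MAX_MEMBERS:
        return "fails: maximum number of switches reached"
    used = {entry.member_id for entry in entries}
    # Assumption: plug-and-go takes the lowest unused member ID.
    free = min(i for i in range(1, MAX_MEMBERS + 1) if i not in used)
    return f"plug-and-go as member {free}"


# Entries 1-9 from the example; member 4 carries the mistyped MAC address.
entries = [
    Entry(1, "1cc1de-4d87c0", "J9573A"),
    Entry(2, "1cc1de-4dc740", "J9573A"),
    Entry(3, "1cc1de-4dbd40", "J9575A"),
    Entry(4, "1cc1de-444444", "J9576A"),
    Entry(5, "1cc1de-000005", "J9576A"),
    Entry(6, "1cc1de-000006", "J9576A"),
    Entry(7, "1cc1de-000007", "J9573A"),
    Entry(8, "1cc1de-000008", "J9573A"),
    Entry(9, "1cc1de-000009", "J9574A"),
]

print(classify_join(entries, "1cc1de-4d79c0", "J9576A"))  # plug-and-go as member 10
```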

Viewing a stack with 9 members

This shows the stack before boot.

HP Stack 3800(stacking)#: show stacking

Stack ID         : 00031cc1-de4d87c0
MAC Address      : 1cc1de-4dc765
Stack Topology   : Ring
Stack Status     : Active
Uptime           : 0d 0h 56m
Software Version : KA.15.05.0000x

Mbr
ID  Mac Address   Model                                 Pri Status
--- ------------- ------------------------------------- --- --------------
1   1cc1de-4d87c0 HP J9573A 3800-24G-PoE+-2SFP+ Switch  200 Standby
2   1cc1de-4dc740 HP J9573A 3800-24G-PoE+-2SFP+ Switch  128 Commander
3   1cc1de-4dbd40 HP J9575A 3800-24G-2SFP+ Switch       128 Member
4   1cc1de-444444 HP J9576A 3800-48G-4SFP+ Switch       175 Not Joined
5   1cc1de-000005 HP J9576A 3800-48G-4SFP+ Switch       128 Not Joined
6   1cc1de-000006 HP J9576A 3800-48G-4SFP+ Switch       128 Not Joined
7   1cc1de-000007 HP J9573A 3800-24G-PoE+-2SFP+ Switch  128 Not Joined
8   1cc1de-000008 HP J9573A 3800-24G-PoE+-2SFP+ Switch  128 Not Joined
9   1cc1de-000009 HP J9574A 3800-48G-PoE+-4SFP+ Switch  128 Not Joined

Viewing a member joining the stack

This shows that, after booting, the switch is joined as member 10.

HP Stack 3800(config)#: show stacking

Stack ID         : 00031cc1-de4d87c0
MAC Address      : 1cc1de-4dc765
Stack Topology   : Mesh
Stack Status     : Active
Uptime           : 0d 1h 11m
Software Version : KA.15.05.0000x

Mbr
ID  Mac Address   Model                                  Pri Status
--- ------------- -------------------------------------- --- -------------
1   1cc1de-4d87c0 HP J9573A 3800-24G-PoE+-2SFP+ Switch   200 Standby
2   1cc1de-4dc740 HP J9573A 3800-24G-PoE+-2SFP+ Switch   128 Commander
3   1cc1de-4dbd40 HP J9575A 3800-24G-2SFP+ Switch        128 Member
4   1cc1de-444444 HP J9576A 3800-48G-4SFP+ Switch        175 Not Joined
5   1cc1de-000005 HP J9576A 3800-48G-4SFP+ Switch        128 Not Joined
6   1cc1de-000006 HP J9576A 3800-48G-4SFP+ Switch        128 Not Joined
7   1cc1de-000007 HP J9573A 3800-24G-PoE+-2SFP+ Switch   128 Not Joined
8   1cc1de-000008 HP J9573A 3800-24G-PoE+-2SFP+ Switch   128 Not Joined
9   1cc1de-000009 HP J9574A 3800-48G-PoE+-4SFP+ Switch   128 Not Joined
10  1cc1de-4d79c0 HP J9576A 3800-48G-4SFP+ Switch        128 Member

To correct this issue:

  1. Write down the correct MAC address.

  2. Remove the member that was added using plug-and-go with the strictly provisioned, mismatched MAC address as shown in the following example.

  3. Update the strictly provisioned entry with the correct MAC address.

  4. Boot the switch.

Removing a member and updating the entry with a MAC address

HP Stack 3800(config)#: stacking member 10 remove reboot

The specified stack member will be removed from the stack and
its configuration will be erased. The resulting configuration
will be saved. The stack member will be rebooted and join as
a new member. Continue [y/n]? 

Y

HP Stack 3800(config)#: stacking member 4 type J9576A mac-address

Viewing that member 4 joined the stack

This shows that member 4 has joined the stack.

HP Stack 3800(config)#: show stacking

Stack ID         : 00031cc1-de4d87c0
MAC Address      : 1cc1de-4dc765
Stack Topology   : Mesh
Stack Status     : Active
Uptime           : 0d 1h 19m
Software Version : KA.15.05.0000x

Mbr
ID  Mac Address   Model                                  Pri Status
--- ------------- -------------------------------------- --- ---------------
1   1cc1de-4d87c0 HP J9573A 3800-24G-PoE+-2SFP+ Switch   200  Standby
2   1cc1de-4dc740 HP J9573A 3800-24G-PoE+-2SFP+ Switch   128  Commander
3   1cc1de-4dbd40 HP J9575A 3800-24G-2SFP+ Switch        128  Member
4   1cc1de-4d79c0 HP J9576A 3800-48G-4SFP+ Switch        175  Member
5   1cc1de-000005 HP J9576A 3800-48G-4SFP+ Switch        128  Not Joined
6   1cc1de-000006 HP J9576A 3800-48G-4SFP+ Switch        128  Not Joined
7   1cc1de-000007 HP J9573A 3800-24G-PoE+-2SFP+ Switch   128  Not Joined
8   1cc1de-000008 HP J9573A 3800-24G-PoE+-2SFP+ Switch   128  Not Joined
9   1cc1de-000009 HP J9574A 3800-48G-PoE+-4SFP+ Switch   128  Not Joined

Troubleshooting a mismatched stack-ID

Viewing a stack with 3 unjoined switches

This is an example of a stack that has two members with three more members that have been strictly provisioned, following the deterministic method of initial installation.

HP Stack 3800#: show stack

Stack ID         : 00031cc1-de4d87c0
MAC Address      : 1cc1de-4dc765
Stack Topology   : Chain
Stack Status     : Active
Uptime           : 0d 0h 2m
Software Version : KA.15.05.0000x

Mbr
ID  Mac Address   Model                                 Pri Status
--- -----------   ------------------------------------- --- -----------
1   1cc1de-4d87c0 HP J9573A 3800-24G-PoE+-2SFP+ Switch  200 Standby
2   1cc1de-4dc740 HP J9573A 3800-24G-PoE+-2SFP+ Switch  128 Commander
3   1cc1de-4dbd40 HP J9575A 3800-24G-2SFP+ Switch       128 Not Joined
4   1cc1de-4d79c0 HP J9576A 3800-48G-4SFP+ Switch       175 Not Joined
5   1cc1de-4da900 HP J9576A 3800-48G-4SFP+ Switch       128 Not Joined

When switch 3 is powered on, it does not join the stack.

The stack ports for the new switch appear online; however, the show stacking command shows that the switch has not been recognized.

Viewing the switch is not recognized

HP Stack 3800(config)#: show stacking stack-ports member 1,2

Member 1

Member Stacking Port  State Peer Member   Peer Port
------ -------------- ----- ------------- -------
1      1              Down  0             0
1      2              Up    2             1
1      3              Down  0             0
1      4              Down  0             0

Member 2

Member Stacking Port  State Peer Member   Peer Port
------ -------------- ----- ------------- -------
2      1              Up    1             2
2      2              Up    0             0
2      3              Down  0             0
2      4              Down  0             0

The show stacking command still shows the member as “Not Joined.”

A log file indicates that a “topo /hello” was seen from a switch that was not part of the current stack ID. The console of the switch that should have been member 3 shows the following example output.

Viewing output from the “not joined” switch

HP Stack 3800#: show stacking

Stack ID         : 00011cc1-de4dbd40
MAC Address      : 1cc1de-4dbd64
Stack Topology   : Unknown
Stack Status     : Active
Uptime           : 0d 0h 1m
Software Version : KA.15.05.0000x

Mbr
ID  Mac Address   Model                             Pri Status
--- ------------- --------------------------------- --- -------------------
1   1cc1de-4dbd40 HP J9575A 3800-24G-2SFP+ Switch   128 Commander

The output is different if you have an inactive fragment, since this switch can have the configuration from an old stack. In this case, it might be inactive and show “Missing” switches from the old configuration. The stack ID value does not match the stack ID of the HP Stack 3800 stack.

Stacking factory reset

HP Stack 3800#: stacking factory-reset
Configuration will be deleted and device rebooted,continue [y/n]? 


Y

To join this switch to the other stack, execute the stacking factory-reset command to erase all of the stale stacking configuration information. This command automatically reboots the switch and on its subsequent boot, the switch is able to join the new stack.
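The stack ID comparison can also be made on captured show stacking output from both sides. The following Python sketch assumes the "Stack ID : value" line format shown in the examples above; the function name stack_id is hypothetical.

```python
import re


def stack_id(show_stacking_text):
    """Extract the Stack ID value from captured show stacking output."""
    match = re.search(r"^Stack\s+ID\s*:\s*(\S+)", show_stacking_text, re.MULTILINE)
    return match.group(1) if match else None


STACK_OUTPUT = """\
Stack ID         : 00031cc1-de4d87c0
MAC Address      : 1cc1de-4dc765
"""

SWITCH_OUTPUT = """\
Stack ID         : 00011cc1-de4dbd40
MAC Address      : 1cc1de-4dbd64
"""

if stack_id(STACK_OUTPUT) != stack_id(SWITCH_OUTPUT):
    print("Stack IDs differ: the switch holds stale stacking configuration")
else:
    print("Stack IDs match")
```

A mismatch, as in this example, indicates that the switch carries configuration from a different stack.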

Troubleshooting logging

Use the show logging command to troubleshoot stacking problems.

Syntax

show logging [a|b|r|s|t|m|p|e|w|i|d|filter|option-str|substring ...]

The options a|r|substring can be used in combination with an event class option.

a

Instructs the switch to display all recorded log events, which includes events from previous boot cycles.

b

Display log events as time since boot instead of date/time format.

r

Instructs the switch to display recorded log events in reverse order (most recent first).

s

Display AMM and SMM log events.

t

Display log events with a granularity of 10 milliseconds.

substring

Instructs the switch to display only those events that match the substring.

The remaining event class options are listed in order of severity - lowest severity first. The output of the command is confined to event classes of equal or higher severity.

Only one of options d|i|w|e|p|m can be used in the command at a time.

m

Display major type of messages.

e

Display error event class.

p

Display major and error type of messages.

w

Display major, error and warning type of messages.

i

Display major, error, warning and information.

d

Display major, error, warning, information and debug messages.

filter

Display log filter configuration and status information.

option-str

Filter events shown.
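The event class options can be read as sets of severity letters. The following Python sketch applies the option descriptions above to already-captured log lines; it is an offline illustration, not the switch's implementation, and the names SEVERITY_OPTIONS and filter_events are hypothetical. It assumes each line starts with its severity letter and that the input is in oldest-first order.

```python
# Severity letters shown by each option, taken from the descriptions above.
SEVERITY_OPTIONS = {
    "m": {"M"},
    "e": {"E"},
    "p": {"M", "E"},
    "w": {"M", "E", "W"},
    "i": {"M", "E", "W", "I"},
    "d": {"M", "E", "W", "I", "D"},
}


def filter_events(lines, option, substring=None, reverse=False):
    """Select log lines by event class option, optional substring and order."""
    classes = SEVERITY_OPTIONS[option]
    selected = [
        line
        for line in lines
        if line[:1] in classes and (substring is None or substring in line)
    ]
    return selected[::-1] if reverse else selected


EVENTS = [
    "M 10/04/13 11:01:21 00686 system: Test - This message was sent in a group 00000",
    "E 10/04/13 11:01:21 00686 system: Test - This message was sent in a group 00001",
    "I 10/02/00 00:46:56 02558 chassis: ST2-CMDR: Stack port 2 is now on-line.",
]

for line in filter_events(EVENTS, "i", substring="Stack port", reverse=True):
    print(line)
```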

Show logging -e

HP-5406zl$ show logging -e
Keys: W=Warning I=Information
M=Major D=Debug E=Error
---- Event Log listing: Events Since Boot ----
M 10/04/13 11:01:21 00686 system: Test - This message was sent in a group 00000
E 10/04/13 11:01:21 00686 system: Test - This message was sent in a group 00001
---- Bottom of Log : Events Listed = 2 ----

Logging output

HP Stack 3800 #: show logging -r -s
I 10/02/00 00:46:56 02558 chassis: ST1-STBY: Stack port 3 is now on-line.
I 10/02/00 00:46:56 02558 chassis: ST2-CMDR: Stack port 2 is now on-line.

Troubleshooting a strictly provisioned, mismatched type

When the MAC address matches a strictly provisioned configuration, it either matches the configured type and succeeds, or it does not match the type and fails. This is because the MAC address is unique and you cannot have duplicate MAC addresses.

The log messages indicate that this was the type of failure. The information in the log message helps you correct the configuration.

The switch that fails to join automatically reboots. Execute the show stacking command to view the mis-configured entry.

Viewing the mis-configured entry

HP Stack 3800(config)#: show stacking

Stack ID         : 00011cc1-de4d87c0
MAC Address      : 1cc1de-4d87e5
Stack Topology   : Mesh
Stack Status     : Active
Uptime           : 4d 0h 2m
Software Version : KA.15.05.0000x

Mbr
ID  Mac Address   Model                                 Pri Status
--- ------------- ------------------------------------- --- ----------------
1   1cc1de-4d87c0 HP J9573A 3800-24G-PoE+-2SFP+ Switch  128 Commander
2   1cc1de-4dc740 HP J9573A 3800-24G-PoE+-2SFP+ Switch  128 Standby
3   1cc1de-4dbd40 HP J9575A 3800-24G-2SFP+ Switch       128 Member
4   1cc1de-4d79c0 HP J9576A 3800-48G-4SFP+ Switch       175 Member
5   1cc1de-4da900 HP J9575A 3800-24G-2SFP+ Switch       128 Not Joined

The MAC address in the configuration entry for member 5 matches a J9576A switch that will be added; however, the join fails because the entry is configured as a J9575A switch.

The following example shows the log entries with the failure to join the stack.

Viewing log entries for stacking failures

W 10/06/00 03:24:37 03255 stacking: ST2-STBY: Provisioned switch with Member ID 5 removed due to loss of communication
I 10/06/00 03:24:37 02558 chassis: ST2-STBY: Stack port 4 is now on-line.
I 10/06/00 03:24:35 02558 chassis: ST4-MMBR: Stack port 2 is now on-line.
W 10/06/00 03:24:35 03274 stacking: ST1-CMDR: Member 5 (1cc1de-4da900) cannot join stack due to incorrect product id: J9576A
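The "incorrect product id" message carries everything needed to correct the entry: the member ID, the MAC address, and the J-number the switch actually reported. The following Python sketch extracts those fields from captured log text and prints the two commands used in this section. The message pattern is taken from the single log line shown here and may not cover other firmware versions; the function name is hypothetical.

```python
import re

PATTERN = re.compile(
    r"Member (\d+) \(([0-9a-f]{6}-[0-9a-f]{6})\) cannot join stack "
    r"due to incorrect product id: (\S+)"
)


def find_type_mismatches(log_text):
    """Return member ID, MAC and reported J-number for each mismatch entry."""
    normalized = " ".join(log_text.split())  # undo line wrapping in the capture
    return [
        {"member": int(member), "mac": mac, "actual_type": product}
        for member, mac, product in PATTERN.findall(normalized)
    ]


LOG = """\
W 10/06/00 03:24:35 03274 stacking: ST1-CMDR: Member 5 (1cc1de-4da900) cannot
join stack due to incorrect product id: J9576A
"""

for hit in find_type_mismatches(LOG):
    # Command forms as they appear in this section.
    print(f"stacking member {hit['member']} remove")
    print(f"stacking member {hit['member']} type {hit['actual_type']} mac {hit['mac']}")
```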

You cannot re-type the configuration command with the same MAC address, member ID, and a different J-number. Remove the configuration and then reconfigure this switch member entry.

Removing a stack member and reconfiguring

HP Stack 3800(config)#: stacking member 5 remove

The specified stack member configuration will be erased. The
resulting configuration will be saved. Continue [y/n]? 

Y

HP Stack 3800(config)#: stacking member 5 type J9576A mac 1cc1de-4da900
This will save the current configuration. Continue [y/n]? 

Y

HP Stack 3800(config)#: show stacking

Stack ID         : 00011cc1-de4d87c0
MAC Address      : 1cc1de-4d87e5
Stack Topology   : Mesh
Stack Status     : Active
Uptime           : 4d 0h 35m
Software Version : KA.15.05.0000x

Mbr
ID  Mac Address   Model                                 Pri Status
--- ------------- ------------------------------------- --- ---------------
1   1cc1de-4d87c0 HP J9573A 3800-24G-PoE+-2SFP+ Switch  128 Commander
2   1cc1de-4dc740 HP J9573A 3800-24G-PoE+-2SFP+ Switch  128 Standby
3   1cc1de-4dbd40 HP J9575A 3800-24G-2SFP+ Switch       128 Member
4   1cc1de-4d79c0 HP J9576A 3800-48G-4SFP+ Switch       175 Member
5   1cc1de-4da900 HP J9576A 3800-48G-4SFP+ Switch       128 Not Joined

Boot the switch with the matching MAC/Type.

Viewing joined stack members

HP Stack 3800(config)#: show stacking

Stack ID         : 00011cc1-de4d87c0
MAC Address      : 1cc1de-4d87e5
Stack Topology   : Mesh
Stack Status     : Active
Uptime           : 4d 0h 50m
Software Version : KA.15.05.0000x

Mbr
ID  Mac Address   Model                                 Pri Status
--- ------------- ------------------------------------- --- ----------------
1   1cc1de-4d87c0 HP J9573A 3800-24G-PoE+-2SFP+ Switch  128 Commander
2   1cc1de-4dc740 HP J9573A 3800-24G-PoE+-2SFP+ Switch  128 Standby
3   1cc1de-4dbd40 HP J9575A 3800-24G-2SFP+ Switch       128 Member
4   1cc1de-4d79c0 HP J9576A 3800-48G-4SFP+ Switch       175 Member
5   1cc1de-4da900 HP J9576A 3800-48G-4SFP+ Switch       128 Member

Troubleshooting maximum stack members exceeded

This failure can happen if you have an active stack that has already reached its maximum number of members. It can also happen when the maximum number of switches is reached with a combination of active members and strictly provisioned members.

Since one of the suggested initial deployment techniques is a deterministic method using strictly provisioned entries, this failure example demonstrates what occurs if the maximum number of members is reached by strictly provisioning 10 members. At least one of these configuration entries has an incorrect MAC address. As in the mismatched MAC address example, the stack attempts to add the switch using “plug-and-go”; however, because the maximum number of members has already been reached, the switch cannot join the stack.

The following example shows the show stacking output before the switch attempts to join.

Viewing stack members before the join

HP Stack 3800(config)#: show stacking

Stack ID         : 00031cc1-de4d87c0
MAC Address      : 1cc1de-4dc765
Stack Topology   : Mesh
Stack Status     : Active
Uptime           : 0d 1h 27m
Software Version : KA.15.05.0000x

Mbr
ID  Mac Address   Model                                  Pri Status
--- ------------- -------------------------------------- --- ---------------
1   1cc1de-4d87c0 HP J9573A 3800-24G-PoE+-2SFP+ Switch   200 Standby
2   1cc1de-4dc740 HP J9573A 3800-24G-PoE+-2SFP+ Switch   128 Commander
3   1cc1de-4dbd40 HP J9575A 3800-24G-2SFP+ Switch        128 Member
4   1cc1de-4d79c0 HP J9576A 3800-48G-4SFP+ Switch        175 Member
5   1cc1de-000005 HP J9576A 3800-48G-4SFP+ Switch        128 Not Joined
6   1cc1de-000006 HP J9576A 3800-48G-4SFP+ Switch        128 Not Joined
7   1cc1de-000007 HP J9573A 3800-24G-PoE+-2SFP+ Switch   128 Not Joined
8   1cc1de-000008 HP J9573A 3800-24G-PoE+-2SFP+ Switch   128 Not Joined
9   1cc1de-000009 HP J9574A 3800-48G-PoE+-4SFP+ Switch   128 Not Joined
10  1cc1de-00000a HP J9574A 3800-48G-PoE+-4SFP+ Switch   128 Not Joined

When a switch that does not match the MAC addresses attempts to join, that switch reboots when the maximum configuration is detected. The active stack logs the following:

W 10/07/00 06:01:11 03253 stacking: ST3-CMDR: Maximum number of switches in the stack has been reached. Cannot add 1cc1de-4da900 type J9576A

The failure can be due to one of the strictly provisioned entries being incorrect. Correct that entry, and then reboot the switch. If there are already 10 switches in the stack, you cannot add additional switches at this time.

Troubleshooting a bad cable

Bad cables can cause the stack port to flap or go down completely. If there is an excessive number of port flaps, the port is disabled and the following log message appears:

W 10/06/00 23:23:16 03260 chassis: ST4-CMDR: Stack port 1 disabled due to excessive errors. Check cable. To reenable use 'stacking member 4 stack-port 1 enable'.

When this occurs, the show stacking stack-ports command shows the port with a status of “Disabled”.

Viewing a disabled stack port

HP Stack 3800$ show stacking stack-ports

Member   Stacking  Port State  Peer Member Peer Port
------------------------------------------------------
1          1         Up           5            2
1          2         Up           2            1
1          3         Up           3            3
1          4         Up           4            3
2          1         Up           1            2
2          2         Up           3            1
2          3         Up           4            4
2          4         Up           5            3
3          1         Up           2            2
3          2         Down         0            0
3          3         Up           1            3
3          4         Up           5            4
4          1         Disabled     0            0
4          2         Up           5            1
4          3         Up           1            4
4          4         Up           2            3
5          1         Up           4            2
5          2         Up           1            1
5          3         Up           2            4
5          4         Up           3            4

If the cable failure is persistent rather than intermittent, the port stays in the Down state. The logs show any transition.

I 10/07/00 06:01:15 02559 chassis: ST4-STBY: Stack port 3 is now off-line.
I 10/07/00 06:01:16 02559 chassis: ST3-CMDR: Stack port 3 is now off-line.
I 10/07/00 06:01:16 02559 chassis: ST2-MMBR: Stack port 1 is now off-line.
I 10/07/00 06:01:15 02559 chassis: ST5-MMBR: Stack port 2 is now off-line.
I 10/07/00 06:01:15 02558 chassis: ST2-MMBR: Stack port 1 is now on-line.
I 10/07/00 06:01:12 02558 chassis: ST5-MMBR: Stack port 2 is now on-line.
I 10/07/00 06:01:10 02558 chassis: ST4-STBY: Stack port 3 is now on-line.
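Counting these transitions per stack port gives a quick indication of which link is flapping. The following Python sketch counts on-line and off-line events in captured log text. It assumes the message wording shown above; the function name is hypothetical, and the number of transitions at which the switch itself disables a port is not stated in this section, so none is assumed.

```python
import re
from collections import Counter

EVENT = re.compile(r"ST(\d+)-\w+: Stack port (\d+) is now (on|off)-line")


def count_transitions(log_text):
    """Count on-line/off-line events per (member, stack port)."""
    counts = Counter()
    for member, port, _state in EVENT.findall(log_text):
        counts[(int(member), int(port))] += 1
    return counts


LOG = """\
I 10/07/00 06:01:15 02559 chassis: ST4-STBY: Stack port 3 is now off-line.
I 10/07/00 06:01:16 02559 chassis: ST3-CMDR: Stack port 3 is now off-line.
I 10/07/00 06:01:16 02559 chassis: ST2-MMBR: Stack port 1 is now off-line.
I 10/07/00 06:01:15 02559 chassis: ST5-MMBR: Stack port 2 is now off-line.
I 10/07/00 06:01:15 02558 chassis: ST2-MMBR: Stack port 1 is now on-line.
I 10/07/00 06:01:12 02558 chassis: ST5-MMBR: Stack port 2 is now on-line.
I 10/07/00 06:01:10 02558 chassis: ST4-STBY: Stack port 3 is now on-line.
"""

for (member, port), total in sorted(count_transitions(LOG).items()):
    print(f"member {member} port {port}: {total} transition(s)")
```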

The following example shows member 3, port 2, which should be connected to member 4, port 1. The ports are down because the cable is bad or disconnected.

Viewing that two ports are down due to a bad connection

HP Stack 3800#: show stacking stack-ports

Member   Stacking  Port State  Peer Member Peer Port
------------------------------------------------------
1          1         Up           5            2
1          2         Up           2            1
1          3         Up           3            3
1          4         Up           4            3
2          1         Up           1            2
2          2         Up           3            1
2          3         Up           4            4
2          4         Up           5            3
3          1         Up           2            2
3          2         Down         0            0
3          3         Up           1            3
3          4         Up           5            4
4          1         Down         0            0
4          2         Up           5            1
4          3         Up           1            4
4          4         Up           2            3
5          1         Up           4            2
5          2         Up           1            1
5          3         Up           2            4
5          4         Up           3            4

The solution in both cases is to ensure that the cable is firmly connected at both ends. If the problem continues, replace the cable. It is possible that there could be a problem with the stack port itself. In this case, validation of this issue requires the installation of a known good cable to see if that cable also fails.
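On larger stacks, the ports that need attention can be picked out of captured show stacking stack-ports output automatically. The following Python sketch assumes the five-column row layout shown in the examples above and the three states that appear in this section (Up, Down, Disabled); the function names are hypothetical.

```python
import re

ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(Up|Down|Disabled)\s+(\d+)\s+(\d+)\s*$")


def parse_stack_ports(text):
    """Return (member, port, state, peer_member, peer_port) tuples."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            member, port, state, peer_member, peer_port = match.groups()
            rows.append((int(member), int(port), state, int(peer_member), int(peer_port)))
    return rows


def problem_ports(rows):
    """Return (member, port, state) for every port that is not Up."""
    return [(member, port, state) for member, port, state, _, _ in rows if state != "Up"]


SAMPLE = """\
Member   Stacking  Port State  Peer Member Peer Port
------------------------------------------------------
3          1         Up           2            2
3          2         Down         0            0
3          3         Up           1            3
4          1         Down         0            0
4          2         Up           5            1
"""

print(problem_ports(parse_stack_ports(SAMPLE)))
# prints [(3, 2, 'Down'), (4, 1, 'Down')]
```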

The port state is not UP until both ends of the cable are connected and the cable has been validated as a genuine HP cable.

To view the statistics on the physical port, execute the show tech command in member-context 4. The following examples show the types of information displayed.

Viewing show tech output

Port Number : 1                    State : Available
Last Event : Available             Start Req : 1
NE Present : 1                     HPID Good : 1
HPID Fails : 0                     FE Present : 1
Rem Dev Rdy : 1
ESSI Link : 1                      ESSI Good : 1
ESSI Fails : 0                     ESSI TX En : 1
ICL Good : 1                       ICL Enabled : 1
LP Local RDY: 1                    LP Rem RDY : 1
LP DONE : 1                        ICL FailCnt : 0 (10 second interval)
ICL FailCnt : 0 (10 minute interval)
NE Presence HW : 1
FE Presence HW : 1
Rem Dev Rdy HW : 1
Local Dev Rdy HW : 1
Asserted NE Presence HW : 1
Asserted FE Presence HW : 1
Asserted Rem Dev Rdy HW : 1
Phy Frame Errors : 0
Invalid Status Errors : 0
Invalid Packet Type Errors : 0
Incomplete Packet Errors : 0
Checksum Errors : 0
ESSI Flow Out This Port (HW) : 0x2

Viewing trace information for a port

Trace for Port 1
[ 0] [Info ] Start Request Received (Empty) [0]
[ 1] [Info ] Waiting for Stack Module Good (Empty) [0]
[ 2] [Info ] Stack Module Good Received (Empty) [0]
[ 3] [Info ] Cable Insertion Detected (Empty) [1]
[ 4] [Info ] Re-enable NE Present Int [487]
[ 5] [Info ] Starting Cable HPID Validation (Inserted) [488]
[ 6] [Info ] Skipping Cable HPID Validation (Inserted) [488]
[ 7] [Info ] Far End Insertion Detected (Valid) [988]
[ 8] [Info ] Polling for ESSI phy link up (Valid) [988]
[ 9] [Info ] ESSI Link Up [9988]
[10] [Info ] ESSI Link Good (Valid) [9988]
[11] [Info ] ESSI Linked at 9988 ms [9988]
[12] [Info ] Remote Device Ready Detected (Valid) [10898]
[13] [Info ] ICL Change Request Enable (Cable Ready) [10898]
[14] [Info ] Detected Remote Ready Drop. (Cable Ready) [12651]
[15] [Info ] ICL Good. Behind. Partner ready. (Cable Ready) [12988]
[16] [Info ] ICL GOOD received at 2091 ms [12988]
[17] [Info ] Partner LP ready. (Cable Ready) [13980]
[18] [Info ] Set Device Ready. (Cable Ready) [13987]
[19] [Info ] ESSI Link Verified [13988]
[20] [Info ] ESSI Able to Transmit (Cable Ready) [13988]
[21] [Info ] ESSI Verified at 3091 ms [13988]
[22] [Info ] Cable Available (Available) [13988]

Troubleshooting when a switch crashes and reboots

Although the switch software is highly reliable, a switch in the stack can experience a software issue that results in the crash and reboot of that switch. The crash can occur in the software running on the management CPU or in the software running on the interface CPUs. In either case, crash information is generated and the switch is rebooted.

The resiliency of the stack is determined by the stacking topology; however, in all cases, the interfaces/ports on the switch that crashes are brought down and a reboot of that switch occurs.

The following table describes how the stack reacts to the crashing switch, depending on what role the switch had when the crash occurred. The assumption in this table is that the topology is a resilient topology (that is, a mesh or ring).

Stacking role Description
Commander
  • The standby takes over as the new Commander

  • A new standby is elected

  • Crashing switch writes core file to local stable storage

  • Crashing switch reboots and joins the stack

  • Core file and crash information for this switch is available from the Commander

Standby
  • A new standby is elected

  • Crashing switch writes core file to local stable storage

  • Crashing switch reboots and joins the stack

  • Core file and crash information for this switch is available from the Commander

Member
  • Crashing switch writes core file to local stable storage

  • Crashing switch reboots and joins the stack

  • Core file and crash information for this switch is available from the Commander

After a switch crashes, you can collect data to help HP understand why the crash occurred. The information is a combination of crash data, crash log, and core-dump files. The show tech command displays logs of events that happened right before the crash.

Troubleshooting an unresponsive reboot

An unresponsive reboot occurs when a member does not respond to an update packet.

Reboot output

 /* SSM_SWITCH_LOST_EVENT */
// 30 States -> Initial !Discovery !Bid for Cmdr !Become cmdr !
/* Switch Lost */{ssmIgnore ,ssmMbrRmvMbr ,ssmMbrRmvMbr ,ssmMbrRmvCmdr ,
// States -> Cmdr chas wait!Cmdr RFS start! Commander ! Cmdr Merge !
/* Switch Lost */ ssmMbrRmvCmdr ,ssmMbrRmvCmdr ,ssmMbrRmvCmdr ,ssmMbrRmvCmdr ,
// States -> Wait for chas !wait for type !stby RFS start!stby RFS sync !
/* Switch lost */ ssmMbrRmvMbr ,ssmMbrRmvMbr ,ssmMbrRmvMbr ,ssmMbrRmvMbr ,
// States -> Standby !Mbr RFS wait ! Member !Pass through !
/* Switch Lost */ ssmMbrRmvStby ,ssmMbrRmvMbr ,ssmMbrRmvMbr ,ssmIllegal },

Troubleshooting an unexpected Commander or Standby switch selection

When a switch stack is established and a boot/reboot of the stack is performed, the Commander and Standby are selected based on the configured switch priority. There are other rules in the election process that can override this priority.

Viewing the running configuration with priority

HP Switch(config)#: show running-config

; hpStack Configuration Editor; Created on release #KA.15.05.0000x
; Ver #01:00:01

hostname "HP Stack 3800"
stacking
member 1 type "J9573A" mac-address 1cc1de-4d87c0
member 2 type "J9573A" mac-address 1cc1de-4dc740
member 3 type "J9575A" mac-address 1cc1de-4dbd40
member 3 priority 200
member 4 type "J9576A" mac-address 1cc1de-4d79c0
member 4 priority 175
member 5 type "J9576A" mac-address 1cc1de-4da900
exit

On a boot of the stack, member 3 becomes the Commander and member 4 becomes the Standby, based on priority. If this were a chain with member 1 at one end of the chain and member 5 at the other end, the number of hops between switches would also be part of the election process.
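The priority-only part of this selection can be checked against a saved configuration before a reboot. The following Python sketch is a deliberate simplification: it ranks members by configured priority alone and ignores the other election rules (such as hop count in a chain). The default priority of 128 is inferred from the show stacking examples in this section, and breaking ties by lowest member ID is an assumption made only to keep the result deterministic.

```python
import re

DEFAULT_PRIORITY = 128  # inferred from the show stacking examples

CONFIG = """\
stacking
member 1 type "J9573A" mac-address 1cc1de-4d87c0
member 2 type "J9573A" mac-address 1cc1de-4dc740
member 3 type "J9575A" mac-address 1cc1de-4dbd40
member 3 priority 200
member 4 type "J9576A" mac-address 1cc1de-4d79c0
member 4 priority 175
member 5 type "J9576A" mac-address 1cc1de-4da900
exit
"""


def priorities(config):
    """Map member ID to configured priority, using the default if unset."""
    result = {}
    for line in config.splitlines():
        typed = re.match(r"\s*member (\d+) type ", line)
        if typed:
            result.setdefault(int(typed.group(1)), DEFAULT_PRIORITY)
        prio = re.match(r"\s*member (\d+) priority (\d+)", line)
        if prio:
            result[int(prio.group(1))] = int(prio.group(2))
    return result


def expected_roles(config):
    """Return (commander_id, standby_id) by priority only."""
    # Assumption: ties are broken by lowest member ID.
    ranked = sorted(priorities(config).items(), key=lambda item: (-item[1], item[0]))
    return ranked[0][0], ranked[1][0]


print(expected_roles(CONFIG))  # prints (3, 4)
```

If the roles after a boot differ from this result, one of the overriding election rules applied, and the log files are the place to look for the reason.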