Worker or master nodes are not in "Ready" state

Symptom

Nodes come up without their designated hostnames, or nodes do not reach the "Ready" state.

Solution 1
Cause

The DHCP service was inactive when the node was booted.

Action

Log in to the NPS toolkit VM and perform the following steps:

  1. Verify that the image-service pod is running, using the following command:
    kubectl get pods -n nps
  2. If the image-service pod is not running, run the following command to bring up the pod:
    nps baremetal -a install -nos <nos_type> -l debug
    Where <nos_type> is cumulus or aruba.
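
The check in step 1 can be scripted. The following is a minimal sketch, assuming that the pod name contains the string image-service and that kubectl prints its default columns (NAME, READY, STATUS, and so on). The function name image_service_status is hypothetical and is not part of the NPS toolkit.

```shell
#!/bin/sh
# Hypothetical helper (not part of the NPS toolkit).
# Reads the output of `kubectl get pods -n nps` on stdin and prints the
# STATUS column of the first pod whose name contains "image-service".
# Prints "NotFound" if no such pod is listed.
image_service_status() {
    awk '$1 ~ /image-service/ { print $3; found = 1; exit }
         END { if (!found) print "NotFound" }'
}
```

A possible invocation on the NPS toolkit VM is shown below. Any result other than Running suggests that step 2 is needed.

    kubectl get pods -n nps | image_service_status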

Solution 2
Cause

In HPE TICG, the LACP configuration values set for the servers do not match the LACP configuration on the data switches.

Action

  1. Log in to the worker or master node using SSH.
  2. Run the following command to view the LACP configuration of the node:
    cat /etc/sysconfig/network-scripts/ifcfg-bond0
  3. Log in to the data switches and run the following command to verify the LACP configuration set on the switch:
    cat /proc/net/bonding/<bond_name>
  4. Check for the following:
    • A mismatch between the LACP_RATE values on the server and on the switch
    • A missing entry for the UP_DELAY parameter on the server

  5. If there are any mismatches, log in to the NPS toolkit VM and update the iPXE file at the following location with the appropriate values:
    /var/nps/ISO/*.ipxe
  6. Reboot the servers and press F12.
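
The comparison in step 4 can be partly automated on any Linux host that exposes /proc/net/bonding/<bond_name>. The sketch below assumes the standard Linux bonding driver output, which contains lines such as "LACP rate: fast" and "Up Delay (ms): 0". The function name bond_lacp_summary is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (illustrative only).
# Reads the contents of /proc/net/bonding/<bond_name> on stdin and prints
# the LACP rate and up-delay values on a single line.
# A field that does not appear in the input is reported as "missing".
bond_lacp_summary() {
    awk -F': *' '
        /^[ \t]*LACP rate:/ { rate = $2 }
        /^[ \t]*Up Delay/   { updelay = $2 }
        END {
            if (rate == "")    rate = "missing"
            if (updelay == "") updelay = "missing"
            printf "lacp_rate=%s up_delay_ms=%s\n", rate, updelay
        }'
}
```

Running the helper on both sides and comparing the two output lines gives a quick view of any mismatch. For example, assuming the bond is named bond0:

    bond_lacp_summary < /proc/net/bonding/bond0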