Installing image service on RHEL nodes is not complete

Symptom

The installation status is failed or Not done during the installation of the image service on RHCOS nodes.

  1. Do the following:

    1. Check that the image service pod (image-service-0) is in the Running state using the following command:

      kubectl get pods -n nps
    2. If the image service pod is not running, check the /var/nps/logs/nps_error.log file.

  2. Check the baremetal service status using the command:

    nps show --service baremetal
  3. Use Solution 1 if the image-service-0 pod is running and the installation status is Not done.

  4. Use Solution 2 if the installation status is failed.
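
The checks above can be sketched as a small shell helper. This is only an illustration, not part of the NPS tooling: the function names pod_status and choose_solution are hypothetical, the pod status is read from the standard kubectl table (NAME, READY, STATUS columns), and the installation status is assumed to be read by hand from the output of nps show --service baremetal, because the layout of that output is not described here.

```shell
#!/bin/sh
# Hypothetical helpers; they only define functions and run no commands themselves.

# Print the STATUS column for one pod, reading `kubectl get pods` output on stdin.
pod_status() {
    awk -v pod="$1" '$1 == pod { print $3 }'
}

# Map the two observations onto the solutions in this topic.
#   $1: pod status, for example Running
#   $2: installation status, typed in by hand (failed or "Not done")
choose_solution() {
    case "$2" in
        failed|Failed)
            echo "Solution 2" ;;
        "Not done")
            if [ "$1" = "Running" ]; then
                echo "Solution 1"
            else
                echo "Check /var/nps/logs/nps_error.log"
            fi ;;
        *)
            echo "No action listed" ;;
    esac
}

# Example usage on a host with kubectl access (left commented out):
# status=$(kubectl get pods -n nps | pod_status image-service-0)
# choose_solution "$status" "Not done"
```

Passing the installation status as an argument keeps the sketch independent of the nps output format, which may differ between releases.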

Solution 1
Action
  1. Check whether the FLR NIC ports of the corresponding iLOs are in the "connected" state. If not, check the physical connectivity between the FLR NIC port and the data switch port.
  2. If auto deploy is used for installation, run the nps autodeploy command again after connecting the ports.
  3. If manual procedure is used for installation, after connecting the ports, do the following:
    1. To install the operating system through the image service, trigger a one-time PXE boot for the failed servers using the following command:
      nps baremetal -a temp-pxeboot -sl <iLO IP1, iLO IP2> -l debug

      Wait until the OS is installed.

    2. Verify the baremetal installation status again using the command:
      nps show --service baremetal
Solution 2
Action
  1. Run the following command:
    nps show --data servers --node <iLO IP of failed server>
  2. Get the temp_ip under state from the output of step 1.

    This is the OCP IP address which will be assigned to the host.

  3. Ping the temp_ip.
  4. If the ping receives no response, log in to the iLO console and check whether the operating system is installed.
  5. If the operating system is not installed:
    1. Log in to the HPE StoreFabric SN2100M and run the following command to verify whether peer connectivity is established:
      net show clag
    2. If the CLAG is not formed, verify that the installed Cumulus version matches the version listed in the compatibility matrix.
    3. If the Cumulus version is incorrect, see "Cumulus version is incorrect" to continue the image service installation.
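
The ping check in steps 3 and 4 of Solution 2 could be scripted along the following lines. This is a minimal sketch under stated assumptions: host_reachable and report_host are hypothetical names, the -c and -W options are those of the Linux iputils ping (count and per-reply timeout in seconds), and PING_CMD is a made-up override that lets the function be exercised without network access. The address passed in is the temp_ip taken by hand from the nps show --data servers output.

```shell
#!/bin/sh
# Hypothetical helpers; they only define functions and run no commands themselves.

# Return 0 if the address answers within three echo requests, non-zero otherwise.
# Assumes Linux iputils ping; PING_CMD is an illustrative override for dry runs.
host_reachable() {
    ${PING_CMD:-ping} -c 3 -W 2 "$1" >/dev/null 2>&1
}

# Print a one-line verdict that points to the next step in this topic.
report_host() {
    if host_reachable "$1"; then
        echo "$1: reachable"
    else
        echo "$1: no response, check the iLO console"
    fi
}

# Example usage (left commented out; replace the placeholder with the temp_ip):
# report_host <temp_ip>
```

A short count and timeout keep the check quick; a host that is still installing the operating system may need the check repeated later.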