Add RHCOS worker nodes

Prerequisites
  • Run all the commands from NPS toolkit VM.

  • Ensure that the proxy is not set in the environment variables.

  • Ensure that the switches are configured. For more information, see Configuring switch for newly added servers.

Procedure
  1. Run the following command from NPS toolkit VM and write down the bootstrap uuid and iLO IP address:
    nps show --data servers
  2. Run the following command from NPS toolkit VM to get the existing custom data for the bootstrap node:
    nps show --data servers --node <bootstrap node ilo ip>
  3. Run the following command to generate the token from NPS toolkit VM:
    curl -i -k -X POST \
       -H "Content-Type:application/json" \
       -d \
    '{"username":"<username>","password":"<password>","topology_name":"topolgy_name"}' \
     'https://<NPS toolkit VM ip> /nps/v2/tokens/' 
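    For example, a minimal sketch that stores the returned token in a shell variable for use in the next step, assuming jq is available on NPS toolkit VM and that the response body returns the token in a field named token (both are assumptions; adjust to the actual response format):

    # Capture the authentication token for the X-Auth-Token header used in the next step.
    TOKEN=$(curl -s -k -X POST \
       -H "Content-Type:application/json" \
       -d '{"username":"<username>","password":"<password>","topology_name":"<topology_name>"}' \
       'https://<NPS toolkit VM ip>/nps/v2/tokens/' | jq -r '.token')
    echo "$TOKEN"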
    
  4. Patch the bootstrap node hostname to the worker hostname:
    curl -i -k -X PATCH \
       -H "X-Auth-Token:<token>" \
       -H "Content-Type:application/json" \
       -d \
    '{"custom_data":{
    "os_type": "rhcos",
    "hostname": "xxx.xxx.xxx",
    "host_ipaddress": "xx.xx.xx.xx"
    }}' \
     'https://<NPS toolkit VM ip>/nps/v2/infrastructures/servers/<serveruuid>/'
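    For example, with illustrative values (a hypothetical worker FQDN and host IP address; substitute your own server UUID, hostname, and address, and the token captured in the sketch above):

    curl -i -k -X PATCH \
       -H "X-Auth-Token:$TOKEN" \
       -H "Content-Type:application/json" \
       -d \
    '{"custom_data":{
    "os_type": "rhcos",
    "hostname": "worker-03.c10.qa.com",
    "host_ipaddress": "20.20.20.13"
    }}' \
     'https://<NPS toolkit VM ip>/nps/v2/infrastructures/servers/<serveruuid>/'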
  5. Update the role of the bootstrap node to worker.
    Run the following command to update the bootstrap role to the worker role:
    nps patch -c worker -i <bootstrap_ip> -f <patch_data_jsonfile>
    For example, the JSON file must contain the following data:
    {"role":"worker"}
    or
    nps patch -c worker -i <bootstrap_ip> -d '{"role":"worker"}'

    Where, <bootstrap_ip> is the iLO IP address of the server with the bootstrap role.
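    For example, a minimal sequence that writes the patch data to a hypothetical file named worker_patch.json and then applies it:

    # Create the patch data file and update the role of the former bootstrap node.
    echo '{"role":"worker"}' > worker_patch.json
    nps patch -c worker -i <bootstrap_ip> -f worker_patch.json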

  6. To bring up the image-service pod for the installation of RHCOS on the worker node, run the following command from NPS toolkit VM:

    nps baremetal -a install -nos <nos_type> -l debug

    Where,

    <nos_type> is the Network Operating System (NOS) type; the supported values are cumulus and aruba.
    NOTE:

    For any installation issue, check the log file /var/nps/logs/<topology_name>/baremetal_install.log and take necessary steps.
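    For example, for a deployment with Cumulus switches, followed by watching the installation log mentioned in the note (the tail command is optional):

    nps baremetal -a install -nos cumulus -l debug
    tail -f /var/nps/logs/<topology_name>/baremetal_install.log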

  7. Run the following command from NPS toolkit VM and write down the worker node iLO IP addresses with os_type as rhcos:
    nps show --data servers
  8. Run the following command to power on the worker node with one-time PXE boot:
    nps baremetal -a temp-pxeboot -sl <rhcos_worker1_ILOIP, rhcos_worker2_ILOIP> -l debug

    Where, rhcos_worker1_ILOIP and rhcos_worker2_ILOIP are the iLO IP addresses of the RHCOS worker nodes.

    The worker node is powered on and the installation starts.
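    For example, with two illustrative iLO IP addresses passed as a comma-separated list (the list format is an assumption based on the placeholder above; substitute the addresses recorded in step 7):

    nps baremetal -a temp-pxeboot -sl 20.20.1.11,20.20.1.12 -l debug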
    NOTE:

    For any RHCOS installation (operating system installation) issue on a worker node, see Installing image service on RHCOS nodes is not complete.

  9. Approve the pending certificate signing requests (CSRs) for the worker nodes, as shown below.
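    The pending CSRs can be listed and approved with the OpenShift CLI; a minimal sketch, assuming the cluster kubeconfig generated during installation (new worker nodes typically raise more than one CSR, so repeat until no requests remain in Pending state):

    export KUBECONFIG=/var/nps/ISO/ign_config/auth/kubeconfig
    # List all certificate signing requests, then approve each pending one by name.
    oc get csr
    oc adm certificate approve <csr_name>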
  10. Verify and update the status of the node using the following command:
    nps show --service baremetal
  11. Patch the image registry storage temporarily to an empty directory. Log in to NPS toolkit VM and run the following commands:
    export KUBECONFIG=/var/nps/ISO/ign_config/auth/kubeconfig
    oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"storage":{"emptyDir":{}}}}'
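    To confirm that the patch was applied, you can inspect the storage field of the operator configuration (an optional, illustrative check):

    oc get configs.imageregistry.operator.openshift.io cluster -o jsonpath='{.spec.storage}'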
  12. Perform the following steps to verify the worker node:
    1. After the CSRs are approved for the worker nodes, log in to the cluster as a default system user by exporting the cluster kubeconfig file:
      export KUBECONFIG=/var/nps/ISO/ign_config/auth/kubeconfig
    2. Run the following command to check that the worker nodes show Ready as the STATUS and worker as the ROLES:
      [root@npsvm rhocp]# oc get nodes
      NAME                   STATUS   ROLES    AGE   VERSION
      master-01.c10.qa.com   Ready    master   12h   v1.16.2+45a4ac4
      master-02.c10.qa.com   Ready    master   12h   v1.16.2+45a4ac4
      master-03.c10.qa.com   Ready    master   12h   v1.16.2+45a4ac4
      worker-01.c10.qa.com   Ready    worker   12h   v1.16.2+45a4ac4
      worker-02.c10.qa.com   Ready    worker   12h   v1.16.2+45a4ac4
    3. Run the following command from NPS toolkit VM:
      openshift-install --dir=<installation_directory> wait-for install-complete

      Replace <installation_directory> with the path where the ignition files are generated (an illustrative example follows this list).

      The command succeeds when the Cluster Version Operator finishes deploying the OpenShift Container Platform cluster from the Kubernetes API server.

    4. Move dhcpd.conf to the switches, as described in the following steps.
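    For example, assuming the ignition files were generated under /var/nps/ISO/ign_config (the directory that contains the auth/kubeconfig file exported above), the command in the previous list would be:

      openshift-install --dir=/var/nps/ISO/ign_config wait-for install-complete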
  13. Log in to NPS toolkit VM and copy the /var/nps/ISO/dhcpd.conf file to the /etc/dhcp/ directory on both HPE StoreFabric SN2100M switches.
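    For example, assuming root SSH access to the management IP addresses of the two switches (hypothetical placeholders shown), the file can be copied with scp:

    scp /var/nps/ISO/dhcpd.conf root@<switch1_mgmt_ip>:/etc/dhcp/
    scp /var/nps/ISO/dhcpd.conf root@<switch2_mgmt_ip>:/etc/dhcp/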
  14. Enable and start the DHCP service in both the data switches using the following commands:
    systemctl enable dhcpd 
    systemctl start dhcpd
  15. Delete the image service pod for RHCOS using the following command:
    nps baremetal -a delete -nos <nos_type>

    Where, <nos_type> is the Network Operating System (NOS) type; the supported values are cumulus and aruba.