Restoring a controller from a backup

Restore operation


NOTE: To restore a controller from a backup, it is necessary to re-install the controller.


  • In a controller team environment, each active controller is restored as a single system.

  • When the controller is deployed in a VM, standard VM restore tools (such as Snapshot or Clone) can be used.

  • When the controller is deployed on bare metal, standard Linux server-based backup/restore tools (such as rsync, LVM snapshot, and Amanda/Zmanda) can be used.

  • If a backed-up controller in a team fails, use single-system restore to restore the controller. The HA synchronization updates the controller to the latest version.

  • The controller blocks traffic over OpenFlow ports during a restore.


NOTE: The controller ceases to operate during a Restore operation.


System restore requirements

A system backup can be restored only to a system having the following:

  • The same controller version that existed at the time the backup was taken.

  • The same network settings (IP address) that were in place when the backup was taken.

  • The same license ID as was in effect when the controller was installed.


NOTE: If you have modified any environment-specific settings in files such as /opt/sdn/virgo/options or /etc/init/sdnc.conf, ensure that the appropriate changes are made to these files after you re-install the controller and before you start the restore. For example, the network interface that the Virgo service uses (default: eth0) might be eth1 or another setting on some systems.


Restoring a controller from a backup

  1. Uninstall the controller(s) to be restored. If this is a rollback to a previous state, uninstall all controllers.

  2. Before restoring a controller, set CTL_RESTORE_INSTALL_MODE=True in the ~/.sdn_install_options file. If this file is not present, create it with the CTL_RESTORE_INSTALL_MODE entry. If the file is already present, ensure that it includes the entry. This entry directs the installer to configure the controller to start in recovery/restore mode, during which OpenFlow activity is suspended for that controller.
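
Step 2 can be scripted. The following is a minimal sketch, assuming that ~/.sdn_install_options holds one KEY=value entry per line, as the CTL_RESTORE_INSTALL_MODE=True entry suggests. The function name set_install_option is hypothetical and is not part of the controller software.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the controller): make sure a
# KEY=value entry exists in an options file, creating the file if needed.
# Assumes one KEY=value entry per line and that the key and value contain
# no characters that are special to sed (such as "/" or "&").
set_install_option() {
    opt_file=$1
    opt_key=$2
    opt_value=$3

    # Create the file if it is not present.
    [ -f "$opt_file" ] || : > "$opt_file"

    if grep -q "^${opt_key}=" "$opt_file"; then
        # Entry exists: rewrite its value via a temporary file.
        sed "s/^${opt_key}=.*/${opt_key}=${opt_value}/" "$opt_file" > "${opt_file}.tmp" &&
            mv "${opt_file}.tmp" "$opt_file"
    else
        # Entry missing: append it.
        printf '%s=%s\n' "$opt_key" "$opt_value" >> "$opt_file"
    fi
}

# Usage before re-installing the controller (step 2):
#   set_install_option "$HOME/.sdn_install_options" CTL_RESTORE_INSTALL_MODE True
```

Calling the same function with the value false after the restore covers step 9.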

  3. Re-install the failed controller(s), making sure to use the same IP address configuration. During the re-installation, log messages similar to the following appear in the Audit Log:

    root@mak:~/dev/controller/dist# dpkg -i hp-sdn-ctl_1.11_amd64.deb
    Selecting previously unselected package hp-sdn-ctl.
    (Reading database ... 212350 files and directories currently installed.)
    Unpacking hp-sdn-ctl (from hp-sdn-ctl_1.11_amd64.deb) ...
    Setup has detected a compatible jre-headless - 1.7.0_25 
    Creating system group 'sdn'... 
    ...done. 
    Creating system user 'sdn'...
    ...done. 
    Creating system user 'sdnadmin'...
    ...done. 
    Configuring PostgreSQL database...
    * Restarting PostgreSQL 9.1 database server [ OK ]
    ...done.
    Adding SDN-related items to Keystone... 
    keystone stop/waiting
    keystone start/running, process 11514
    ...done.
    Setting up hp-sdn-ctl (1.11) ... 
    Certificate was added to keystore
    CTL_RESTORE_INSTALL_MODE option is set 
    SDN controller will be started in restore mode
    sdna start/running, process 11633 
    sdnc start/running, process 11636
    Processing triggers for ureadahead ...
    

    CAUTION: Do not re-install any applications before you complete the restore process. The restoration adds data from the backup file to the current database contents. If you re-install applications that are part of the controller backup, those applications might end up with duplicate or conflicting entries in their databases. If required, re-install applications only after you have completed all steps of the restore process.


  4. Acquire the authentication token for the system restore:

    curl --noproxy "<controller_ip>" -X POST --fail -ksSfL --url "https://<controller_ip>:8443/sdn/v2.0/auth" -H "Content-Type: application/json" --data-binary '{"login": {"domain": "<domain>","user": "<user>","password": "<password>"}}'


    CAUTION: Credential information (user name, password, domain, and authentication tokens) used in cURL commands might be saved in the command history. For security reasons, HP recommends that you disable command history prior to executing commands containing credential information.


  5. Acquire the controller uid:

    curl --noproxy controller_ip \
    --header "X-Auth-Token:auth_token" --fail -ksSfL --request GET \
    --url "https://controller_ip:8443/sdn/v2.0/systems"

  6. Use the following cURL command to set the IP address:

    curl --noproxy controller_ip --header "X-Auth-Token:auth_token" --fail -ksSfL --request PUT \
    "https://controller_ip:8443/sdn/v2.0/systems/controller_uid" \
    --data-binary '{"system":{"ip":"controller_ip"}}'

  7. Perform a single controller restore on each controller that needs to be restored.

    1. Upload the backup files that will be restored:

      
      curl --noproxy controller_ip -X POST --fail -ksSfL \
      --url "https://controller_ip:8443/sdn/v2.0/restore/backup" \
      -H "X-Auth-Token:auth_token" --data-binary @path-and-file-name.zip
      

      where path-and-file-name is the full path to the file and the filename. The filename MUST match the name you used during the backup.

    2. Initiate the restore:

      curl --noproxy controller_ip --header "X-Auth-Token:auth_token" --fail -ksS --request POST --url \
      "https://controller_ip:8443/sdn/v2.0/restore"

  8. For a controller team, wait for HA synchronization to complete on all the controllers and for the team to become connected. The team can take a few minutes to come back up. Verify that the team status shows all controllers as ACTIVE and one of the team members as the leader.

    curl --noproxy controller_ip \
    --header "X-Auth-Token:auth_token" --fail -ksSfL --request GET \
    --url "https://controller_ip:8443/sdn/v2.0/systems"

    • If fewer than a quorum of controllers are restored, those controllers are updated to the latest state of the running team via HA synchronization. (A quorum is n/2+1, with n/2 rounded down, where n is the total number of controllers in a team. In a three-controller team, a quorum is two controllers.)

    • If the entire team is restored, then each controller is reset to the previous backed-up state.
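
The quorum size can be computed with integer arithmetic. The following sketch assumes that n/2 in the definition above is rounded down, which matches the three-controller example. The function name quorum_size is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: quorum size for a team of n controllers,
# assuming n/2 is rounded down (shell integer division).
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

quorum_size 3   # three-controller team: prints 2
```

For example, restoring one controller of a three-controller team is below the quorum of two, so that controller is brought up to the running team's state rather than resetting the team.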

  9. After the controller restore is complete, change the value of CTL_RESTORE_INSTALL_MODE to false in the ~/.sdn_install_options file on each controller so that it does not affect a future installation, which might not need to start in recovery mode. (This reverses step 2 of “Restoring a controller from a backup”.)

  10. You can query the restore status by sending a GET request to v2.0/restore/status. Because the restore is not hitless, the REST query fails until the controller has successfully restarted.
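
Because the status query fails until the controller has restarted, a script can retry it until it succeeds. The following is a minimal sketch. The function name retry_until_ok, the attempt count, and the delay are hypothetical; controller_ip and auth_token are the same placeholders used in the earlier steps, and the full status URL is assumed to follow the pattern of the other requests.

```shell
#!/bin/sh
# Hypothetical helper: run a command until it succeeds or the attempts
# run out. Returns 0 on the first success, 1 if every attempt fails.
retry_until_ok() {
    attempts=$1
    delay=$2
    shift 2
    while [ "$attempts" -gt 0 ]; do
        if "$@"; then
            return 0
        fi
        attempts=$((attempts - 1))
        # Sleep only if another attempt remains.
        if [ "$attempts" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Usage (assumed values: 30 attempts, 10 seconds apart):
#   retry_until_ok 30 10 curl --noproxy controller_ip \
#       --header "X-Auth-Token:auth_token" --fail -ksS --request GET \
#       --url "https://controller_ip:8443/sdn/v2.0/restore/status"
```

The --fail option matters here: it makes curl exit with a non-zero status on an HTTP error, so the loop keeps retrying until the controller answers successfully.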


    NOTE: To restore a controller team, restore each controller as a standalone controller. See “Distributed (team) backing up and restoring”.



    NOTE: A restore from a backup taken on any release prior to version 2.3 will not complete.