Initiating a Failback (Move)


You can initiate a Failback (Move), whereby the virtual machines in the virtual protection group (VPG) are replicated to a set checkpoint in the recovery site. As part of the process you can also set up reverse protection, whereby you create a VPG on the recovery machine for the virtual machines being replicated.

You can initiate a Failback (Move) to the last checkpoint recorded in the journal, even if the protected site is no longer up. You can initiate a Failback (Move) during a test.

If the protected site is operational, you can initiate the Failback (Move) from the protected site. If the protected site is down, you can initiate the Failback (Move) from the recovery site.

To initiate a Failback:

1. In the Zerto User Interface select ACTIONS > MOVE VPG.

The Move wizard is displayed.

2. Select the VPGs to fail back. By default, all VPGs are listed.

At the bottom, the selection details show the amount of data and the total number of virtual machines selected.

The Direction arrow shows the direction of the process: from the protected site to the peer (recovery) site.

3. Click NEXT.

The EXECUTION PARAMETERS step is displayed.

You can change the following values to use for the recovery:

■ Commit Policy

■ Force Shutdown: Select Yes / No. Selecting YES places the VMs in a Stopped (Deallocated) state.

■ Reverse Protection settings. For more information, see Reverse Protection For a Moved VPG.

■ Keep Source VMs: Prevents removal of the protected virtual machines at the Azure site.

Note: If reverse protection is specified, the Keep Source VMs option is grayed out.

4. To change the commit policy, click the field or select the VPG and click EDIT SELECTED.

a) To commit the recovery operation automatically, with no testing, select Auto-Commit and 0 minutes.

b) Select None if you do not want an automatic commit or rollback. You must manually commit or roll back.

c) To test before committing or rolling back, specify an amount of time to test the recovered machines, in minutes.

This is the amount of time that the commit or rollback operation is delayed, before the automatic commit or rollback action is performed.

During this time period, check that the new virtual machines are OK and then commit the operation or roll it back.

The maximum amount of time you can delay the commit or rollback operation is 1440 minutes, which is 24 hours.

Testing that involves I/O is done on scratch volumes.

■ The more I/Os generated, the more scratch volumes are used, until the maximum size is reached, at which point no more testing can be done.

■ The maximum size of all the scratch volumes is determined by the journal size hard limit and cannot be changed.

■ The scratch volumes reside on the storage defined for the journal.
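The commit policy rules in step 4 can be summarized in a small validation sketch. This is only an illustration of the rules stated above, not part of the Zerto product or its API: the class and field names are hypothetical, and the label "Auto-Rollback" is assumed as the counterpart of the Auto-Commit option.

```python
from dataclasses import dataclass
from enum import Enum

MAX_DELAY_MINUTES = 1440  # 24 hours, the maximum delay stated in step 4


class DefaultAction(Enum):
    """Hypothetical names for the commit policy choices."""
    AUTO_COMMIT = "Auto-Commit"
    AUTO_ROLLBACK = "Auto-Rollback"  # assumed label, not confirmed by the text
    NONE = "None"                    # no automatic action; commit or roll back manually


@dataclass(frozen=True)
class CommitPolicy:
    action: DefaultAction
    delay_minutes: int = 0

    def __post_init__(self):
        # With no automatic action there is no timer to validate.
        if self.action is DefaultAction.NONE:
            return
        if not 0 <= self.delay_minutes <= MAX_DELAY_MINUTES:
            raise ValueError(
                f"delay must be between 0 and {MAX_DELAY_MINUTES} minutes"
            )

    @property
    def allows_testing(self) -> bool:
        """True when there is a window to test before the commit or rollback."""
        return self.action is DefaultAction.NONE or self.delay_minutes > 0
```

For example, `CommitPolicy(DefaultAction.AUTO_COMMIT, 0)` models an immediate commit with no testing, while `CommitPolicy(DefaultAction.AUTO_COMMIT, 60)` leaves a 60-minute test window.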

5. To specify reverse protection, whereby the virtual machines in the VPG are failed over to the recovery site and then protected from the recovery site back to the original site, do one of the following:

■ Click REVERSE PROTECT ALL. This activates reverse protection on all the VPGs that you plan to fail back. The system default values for this procedure are assigned to all the VPGs.

-or-

■ Click the Reverse Protection field.

If you want to configure the VPG for reverse protection, click the REVERSE link.

The Edit Reverse VPG window is displayed and you can edit the reverse protection configuration. The parameters are the same as described when you create a VPG, with the following differences:

■ You cannot add or remove virtual machines in the reverse protected VPG.

■ By default, reverse protection is to the original protected disks. You can specify different storage to be used for the reverse protection.

■ If VMware Tools or Microsoft Integration Services are available for vSphere or Hyper-V respectively, for each virtual machine in the VPG, the IP address of the originally protected virtual machine is used. Thus, during failback the original IP address of the virtual machine on the site where the machine was originally protected is reused. However, if the machine does not contain the utility, DHCP is used. The host version must be 4.1 or higher for re-IP to be enabled.

When committing the Failback, you can reconfigure reverse protection, regardless of the reverse protection settings specified here. For more information, see Reverse Protection For a Moved VPG.
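The IP addressing rule for reverse protection described in step 5 can be read as a simple decision. The sketch below is one interpretation under stated assumptions: the function names and return labels are hypothetical, and treating a host version below 4.1 as "re-IP disabled" regardless of the guest utility is an assumption, since the text only says that 4.1 or higher is required for re-IP.

```python
MIN_HOST_VERSION = (4, 1)  # "4.1 or higher", as stated in step 5


def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '4.1' or '5.0.2' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def failback_ip_mode(has_guest_utility: bool, host_version: str) -> str:
    """Return a hypothetical label for how the VM gets its IP on failback.

    has_guest_utility: True if VMware Tools (vSphere) or Microsoft
    Integration Services (Hyper-V) is available on the virtual machine.
    """
    # Assumption: a host below 4.1 cannot re-IP at all.
    if parse_version(host_version) < MIN_HOST_VERSION:
        return "re-ip-disabled"
    # With the utility, the originally protected machine's IP is reused.
    if has_guest_utility:
        return "original-ip"
    # Without the utility, DHCP is used.
    return "dhcp"
```

Comparing versions as integer tuples avoids the string-comparison trap where "4.10" would sort before "4.2".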

6. To define whether to use a preseeded volume, in the Edit Reverse VPG window select a virtual machine and click EDIT SELECTED.

The Edit Volumes window is displayed.

7. Specify the Volume Source for recovery. Select one of the following options from the drop-down list:

■ Storage account: By selecting this, a new volume is created for the replicated data.

■ Preseeded volume: Select this when you want to copy the protected data to a virtual disk in the recovery site. The path to the protected disk is selected by default.

Zerto recommends using this option, particularly for large disks, so that the initial synchronization is faster, since a Delta Sync can be used to synchronize any changes written to the recovery site after the creation of the preseeded disk.

8. Click OK to close the Edit Volumes window.

9. Click NEXT.

The MOVE step is displayed. The topology will show the VPGs and the virtual machines that are about to be moved from Azure to the original site.

10. Click START MOVE to start the Failback.

11. If a commit policy was set with a timeout greater than zero, as described in step 4, you can check the moved virtual machines on the recovery site before they are removed from the protected site.

Note: If a virtual machine exists on the recovery site with the same name as a virtual machine being migrated, the machine is moved and named in the peer site with a number added as a suffix to the name, starting with the number 1.

The status icon changes to orange and an alert is issued, to warn you that the procedure is waiting for either a commit or rollback.

All testing done during this period, before committing or rolling back the Move operation, is written to thin-provisioned virtual disks, one per virtual machine in the VPG. These virtual disks are automatically defined when the machines are created on the recovery site for testing. The longer the test period, the more scratch volumes are used, until the maximum size is reached, at which point no more testing can be done. The maximum size of all the scratch volumes is determined by the journal size hard limit and cannot be changed. The scratch volumes reside on the storage defined for the journal. Using these scratch volumes makes committing or rolling back the Move operation more efficient.

Note: You cannot take a snapshot of a virtual machine before the Move operation is committed and the data from the journal promoted to the moved virtual machine disks, since the virtual machine volumes are still managed by the VRA and not directly by the virtual machine. Taking a snapshot of a machine that is in the process of being moved will corrupt that machine.
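The naming rule in step 11 for a virtual machine whose name already exists on the recovery site can be illustrated as follows. This is a sketch only: the function name is hypothetical, the separator between the name and the number is an assumption (the text does not state the exact format), and so is the choice to try the next number when a suffixed name is also taken.

```python
def recovered_vm_name(name: str, existing_names: set, separator: str = "-") -> str:
    """Pick a name for a moved VM, adding a numeric suffix on conflict.

    The separator is an assumption; the text only says a number is
    added as a suffix, starting with 1.
    """
    if name not in existing_names:
        return name
    suffix = 1
    # Assumption: keep counting up until an unused name is found.
    while f"{name}{separator}{suffix}" in existing_names:
        suffix += 1
    return f"{name}{separator}{suffix}"
```

With these assumptions, moving `web01` to a site that already has a `web01` gives `web01-1`.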

12. Check the virtual machines on the recovery site, then either:

■ Wait for the specified Commit Policy time to elapse, after which the specified operation, either Commit or Rollback, is performed automatically.

■ Or, in the specific VPG tab, click the Commit or Rollback icon.

■ Click Commit to confirm the commit and, if necessary, set or reset the reverse protection configuration. If the protected site is still up and you can set up reverse protection, you can reconfigure reverse protection by checking the Reverse Protection checkbox and then clicking the Reverse link. Configuring reverse protection here overwrites any settings defined when initially configuring the failover.

■ Click Rollback to roll back the operation, removing the virtual machines that were created on the recovery site and rebooting the machines on the protected site. The Rollback dialog is displayed to confirm the rollback.

■ You can also commit or roll back the operation in the TASKS popup dialog in the status bar or under MONITORING > TASKS.
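How step 12 resolves can be sketched as a small function: a manual Commit or Rollback takes effect immediately, otherwise the policy's default action is performed once the delay has elapsed. The function name, parameters, and string labels are hypothetical and only model the behavior described above.

```python
from typing import Optional

VALID_DEFAULTS = ("commit", "rollback", "none")


def resolve_move(default_action: str,
                 delay_minutes: int,
                 elapsed_minutes: float,
                 manual_action: Optional[str] = None) -> str:
    """Return 'commit', 'rollback', or 'waiting' for a Move in progress."""
    if default_action not in VALID_DEFAULTS:
        raise ValueError(f"unknown default action: {default_action!r}")
    # A manual Commit or Rollback overrides the timer.
    if manual_action in ("commit", "rollback"):
        return manual_action
    # With no automatic action, the Move waits until the user decides.
    if default_action == "none":
        return "waiting"
    # Otherwise the default action runs once the delay has elapsed.
    if elapsed_minutes >= delay_minutes:
        return default_action
    return "waiting"
```

For instance, `resolve_move("commit", 60, 30)` is still waiting, while `resolve_move("commit", 60, 30, manual_action="rollback")` rolls back at once.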

After the virtual machines are up and running and committed in the recovery site, the powered off virtual machines in the protected site are removed from Azure. Finally, data is promoted from the journal to the moved virtual machines.

During promotion of data, you cannot move a host on the moved virtual machines. If the host is rebooted during promotion, make sure that the VRA on the host is running and communicating with the Zerto Virtual Manager before starting up the recovered virtual machines.

Note: If the virtual machines do not power on, the process continues and the virtual machines must be manually powered on. The virtual machines cannot be powered on automatically in a number of situations, such as when there are not enough resources in the resource pool, when the required MAC address is part of a reserved range, or when there is a MAC address conflict or IP conflict, for example, if a clone was previously created with the MAC or IP address.