The Failover Live Process
Use the Failover operation following a disaster to recover protected virtual machines to the recovery site.
| Note: | You can also move virtual machines from the protected site to the recovery site in a planned migration. For details, see Migrating a VPG to the Recovery Site. |
When you set up a failover, you always specify the checkpoint to which you want to recover the virtual machines. Whichever checkpoint you select (the last automatically generated checkpoint, an earlier checkpoint, or a tagged checkpoint), Zerto makes sure that the virtual machines at the remote site are recovered to that point in time. If you set a commit policy that allows testing before the failover is committed, you can check the integrity of the recovered machines first. If the machines are OK, you can commit the failover. Otherwise, you can roll back the operation and then repeat the procedure using a different checkpoint.
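The recover, check, then commit or roll back cycle described above can be sketched as a small loop. This is an illustrative model only, not Zerto's API: the `Checkpoint` class, the `fail_over_with_verification` function, and the four callbacks are hypothetical names introduced here, and real recovery, commit, and rollback are performed through the Zerto user interface or its own interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Checkpoint:
    """Hypothetical stand-in for a recovery checkpoint."""
    identifier: str
    tag: Optional[str] = None  # set only for tagged checkpoints


def fail_over_with_verification(
    checkpoints: Iterable[Checkpoint],
    recover: Callable[[Checkpoint], None],
    verify: Callable[[Checkpoint], bool],
    commit: Callable[[Checkpoint], None],
    roll_back: Callable[[Checkpoint], None],
) -> Optional[Checkpoint]:
    """Try checkpoints in the given order and commit the first one that verifies.

    Returns the committed checkpoint, or None if every candidate was rolled back.
    """
    for checkpoint in checkpoints:
        recover(checkpoint)        # recover the VMs to this point in time
        if verify(checkpoint):     # integrity check before committing
            commit(checkpoint)
            return checkpoint
        roll_back(checkpoint)      # undo, then repeat with the next checkpoint
    return None
```

The caller decides the order of candidates, for example the latest checkpoint first and a tagged checkpoint as a fallback. Passing the operations in as callbacks keeps the sketch independent of any particular management interface.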
The Failover operation has the following basic steps:
| • | If the protected site or Zerto Virtual Manager is down, the process continues with the next step. |
If the protected site or Zerto Virtual Manager is still running, the failover requirements are determined:
| • | If the default is requested, which leaves the protected virtual machines untouched, the Failover operation continues with the next step. |
| • | If shutting down the protected virtual machines is requested and the protected virtual machines do not have Microsoft Integration Services available, the Failover operation fails. |
| • | If forcibly shutting down the protected virtual machines is requested, the protected virtual machines are shut down and the Failover operation continues. |
| • | Creating the virtual machines at the remote site in the production network and attaching each virtual machine to its relevant virtual disks, configured to the checkpoint specified for the recovery. The virtual machines are created without CD-ROM or DVD drives, even if the protected virtual machines had CD-ROM or DVD drives. The operation is also considered successful even if some of the virtual machines in a VPG fail to be created on the recovery site or are created without their complete settings, for example, when re-IP cannot be performed. |
| Note: | If the virtual machines fail to be created on the recovery site in Public Cloud, the failover operation will not succeed. |
The original protected virtual machines are not touched since the assumption is that the original protected site is down.
VHDX disks are always recovered at the recovery site as dynamic disks. By default, VHD disks are recovered at the recovery site with the same configuration as in the protected site.
| • | Preventing automatically moving virtual machines to other hosts: Setting failover clustering to prevent Dynamic Optimization. This prevents automatic live migration of the affected virtual machines during the Failover operation. |
| • | Powering on the virtual machines, making them available to the user. If applicable, the boot order defined in the VPG settings is used to power on the machines. |
| • | If the protected site is still available, for example, after a partial disaster, and reverse protection is possible and specified for the Failover operation, the protected virtual machines are powered off and removed from the inventory. The virtual disks used by the virtual machines in the protected site are used for the reverse protection. A Delta Sync is performed to make sure that the two copies, the new target site disks and the original site disks, are consistent. A Delta Sync is required since the recovered machines can be updated while data is being promoted. |
If reverse protection is selected, and the virtual machines are already protected in other VPGs, continuing with the operation will cause the virtual machines to be deleted from the other VPGs that are protecting them and the journals of these VPGs to be reset. If no other virtual machines are left to protect in such a VPG, the entire VPG will be removed.
Protecting virtual machines or a vCD vApp in several VPGs is enabled only if both the protected site and the recovery site, as well as the VRAs installed on these sites, are version 5.0 or higher.
| Note: | If reverse protection is not possible, the original protected site virtual machines are not powered off and removed. |
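The decision points in the steps above can be summarized in a short sketch. It only models the order of decisions as described in this section. All names (`ShutdownPolicy`, `FailoverError`, `plan_failover_actions`) are hypothetical and are not part of any Zerto interface, and the sketch assumes that a requested graceful shutdown proceeds when Microsoft Integration Services are available, which the text implies but does not state.

```python
from enum import Enum
from typing import List


class ShutdownPolicy(Enum):
    NONE = "none"          # default: leave the protected VMs untouched
    SHUTDOWN = "shutdown"  # graceful shutdown, needs Integration Services
    FORCE = "force"        # forced shutdown


class FailoverError(Exception):
    """Raised when the modeled Failover operation cannot continue."""


def plan_failover_actions(
    protected_site_up: bool,
    policy: ShutdownPolicy,
    integration_services_available: bool,
    reverse_protection_requested: bool,
    reverse_protection_possible: bool,
) -> List[str]:
    """Return the ordered list of actions for one modeled Failover operation."""
    actions: List[str] = []

    # Shutdown requirements are evaluated only if the protected site is reachable.
    if protected_site_up:
        if policy is ShutdownPolicy.SHUTDOWN:
            if not integration_services_available:
                raise FailoverError(
                    "shutdown requested but Integration Services are unavailable"
                )
            actions.append("shut down protected VMs")  # assumed success path
        elif policy is ShutdownPolicy.FORCE:
            actions.append("force shutdown of protected VMs")

    actions.append("create recovery VMs at the selected checkpoint")
    actions.append("prevent automatic live migration")
    actions.append("power on recovery VMs in boot order")

    # Reverse protection needs a reachable protected site and must be possible.
    if protected_site_up and reverse_protection_requested and reverse_protection_possible:
        actions.append("power off protected VMs and remove them from inventory")
        actions.append("run Delta Sync for reverse protection")

    return actions
```

For example, calling `plan_failover_actions(False, ShutdownPolicy.SHUTDOWN, False, True, True)` returns only the three recovery-site actions, because neither the shutdown policy nor reverse protection applies when the protected site is down.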
Failback After the Original Site is Operational
To fail back to the original protected site, the VPG that is now protecting the virtual machines on the recovery site has to be configured, and then a Delta Sync is performed with the disks in the original protected site. Once the VPG is in a protecting state, the virtual machines can be moved back to the original protected site, as described in Migrating a VPG to the Recovery Site.