This paper discusses an approach for creating a good disaster recovery plan for a business enterprise.
The process of preparing a disaster recovery plan begins by identifying the causes and effects of potential disasters, analyzing their likelihood and severity, and ranking them in terms of their business priority. When a disaster strikes, the normal operations of the enterprise are suspended and replaced with the operations spelled out in the disaster recovery plan.
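The ranking step can be sketched in a few lines of Python. This is only an illustration: the 1 to 5 scales, the rule of multiplying likelihood by severity, and the sample values are assumptions made for the example, not a scoring method prescribed by the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (frequent)
    severity: int    # assumed scale: 1 (minor) to 5 (catastrophic)

    @property
    def score(self) -> int:
        # Assumed scoring rule: likelihood multiplied by severity.
        return self.likelihood * self.severity


def rank_risks(risks):
    """Return the risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# Illustrative values only; real figures come from the risk analysis.
risks = [
    Risk("earthquake", likelihood=1, severity=5),
    Risk("virus attack", likelihood=4, severity=3),
    Risk("water outage", likelihood=2, severity=2),
]

for risk in rank_risks(risks):
    print(f"{risk.name}: {risk.score}")
```

An enterprise may prefer a different rule, for example weighting severity more heavily than likelihood; only the `score` property would need to change.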
The disaster recovery plan does not stop at defining the resources or processes that need to be in place to recover from a disaster.
The second section of this paper explains the methods and procedures involved in the disaster recovery planning process. The first step in planning recovery from unexpected disasters is to identify the threats or risks that can bring about disasters by performing a risk analysis that covers threats to business continuity.
Human-caused: These disasters include acts of terrorism, sabotage, virus attacks, operations mistakes, crimes, and so on.
Supplier: These risks are tied to the capacity of suppliers to maintain their level of services in a disaster.
Water: There are certain disaster scenarios where water outages must be considered very seriously, for instance the impact of a water cutoff on computer cooling systems.
Fire: Many factors affect the risk of fire, for instance the facility's location, its materials, neighboring businesses and structures, and its distance from fire stations.
Once the disaster risks have been assessed and the decision has been made to cover the most critical risks, the next step is to determine and list the likely effects of each of the disasters. Simple "one cause multiple effects" diagrams (Figure 3) can be used as tools for specifying the effects of each of the disasters. The intention of this exercise is to produce a list of entities affected by failure due to disasters, which need to be addressed by the disaster recovery plan. It may be noticed that two or more disasters may affect the same entities, and it can be determined which entities are affected most often.
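As a rough sketch, the "one cause multiple effects" diagrams can be captured as a mapping from each disaster to the entities it affects, and the overlap can then be counted. The earthquake entry follows the entities the paper lists for Figure 3; the fire and virus attack entries, and the function name, are hypothetical and only illustrate the idea.

```python
from collections import Counter

# Disaster -> entities that fail. Only the earthquake row comes from the
# paper's Figure 3; the other rows are illustrative assumptions.
effects = {
    "earthquake": [
        "office facility",
        "power system",
        "operations staff",
        "data systems",
        "telephone system",
    ],
    "fire": ["office facility", "power system", "data systems"],
    "virus attack": ["data systems"],
}


def count_affected(effects):
    """Count how many distinct disasters affect each entity."""
    counts = Counter()
    for entities in effects.values():
        counts.update(set(entities))  # set() avoids double counting per disaster
    return counts


counts = count_affected(effects)
for entity, n in counts.most_common():
    print(f"{entity}: affected by {n} disaster(s)")
```

Entities at the top of this list are the ones affected by the most disasters.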
Once the list of entities that may fail due to the various types of disasters is prepared, the next step is to determine the downtime tolerance limit for each entity.
The cost of downtime is the key input for calculating the investment needed in a disaster recovery plan. How the disaster-affected entities depend upon each other is crucial information for preparing the recovery sequence in the disaster recovery plan.
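One way to turn these dependencies into a recovery sequence is a topological sort, sketched below with Python's standard `graphlib` module (Python 3.9 or later). The dependency of data systems on power restoration is the paper's own example; the other dependencies and the function name `recovery_sequence` are assumptions made for illustration.

```python
from graphlib import TopologicalSorter

# Entity -> entities that must be restored before it.
# Only "data systems depends on power system" comes from the paper;
# the remaining entries are hypothetical.
depends_on = {
    "office facility": set(),
    "power system": set(),
    "telephone system": {"power system"},
    "data systems": {"power system"},
    "operations staff": {"office facility"},
}


def recovery_sequence(depends_on):
    """Return an order in which every entity follows its prerequisites."""
    return list(TopologicalSorter(depends_on).static_order())


sequence = recovery_sequence(depends_on)
print(sequence)
```

If the dependencies contain a cycle, `static_order()` raises `graphlib.CycleError`, which signals that the dependency analysis itself needs to be revisited before a sequence can be written into the plan.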
Once the list of affected entities is prepared and each entity's business criticality and failure tendency are assessed, it is time to analyze the recovery methods available for each entity and determine the most suitable one. In the case of data systems, for example, the recovery mechanism usually involves having the critical data systems replicated elsewhere in the network and putting them online with the latest backed-up data available. Given the many options and variations of disaster recovery mechanisms available, it is necessary to carefully evaluate which recovery mechanism is most suitable for an affected entity in a particular organization.


The roles, responsibilities, and reporting hierarchy of the committee members should be clearly defined both for normal operations and for a disaster emergency. Note that not every member of the Disaster Recovery Committee will necessarily participate actively in the actual disaster recovery. Quick and precise detection of a disaster event and an appropriate communication plan are key to reducing the effects of the incoming emergency; in some cases they may give system personnel enough time to implement actions gracefully, thus reducing the impact of the disaster. The best strategy is to have some kind of disaster recovery plan in place in order to return to normal after the disaster has struck. The guidelines are generic in nature; hence they can be applied to any business subsystem within the enterprise. Though high availability and disaster recovery are both related to business continuity, high availability is about providing undisrupted continuity of operations, whereas disaster recovery involves some amount of downtime, typically measured in days. The causes can be natural, human, or mechanical in origin, ranging from events such as the malfunction of a tiny hardware or software component to universally recognized events such as earthquakes, fire, and flood.
The ultimate results are a formal assessment of risk, a disaster recovery plan that includes all available recovery mechanisms, and a formalized Disaster Recovery Committee that has responsibility for rehearsing, carrying out, and improving the disaster recovery plan.
Figure 1 depicts the cycle of stages that lead through a disaster back to a state of normalcy. The plan should also define how to restore operations to a normal state once the disaster's effects are mitigated. An effective disaster recovery plan plays its role in all stages of the operations as depicted above, and it is continuously improved by disaster recovery mock drills and feedback capture processes.
The effects of a disaster that strikes the entire enterprise are different from the effects of a disaster affecting a specific area, office, or utility within the company. A key factor in evaluating risks associated with telephone systems is to study the telephone architecture and determine if any additional infrastructure is required to mitigate the risk of losing the entire telecommunication service during a disaster.
Operations that have run for a long period of time on obsolete hardware or software are a major risk given the lack of spares or support.
A higher value means a longer restoration time; hence the priority of having a disaster recovery mechanism for this risk is higher. In Figure 3, the entities that fail due to the earthquake disaster are the office facility, power system, operations staff, data systems, and telephone system.
This information becomes crucial for preparing the recovery sequence in the disaster recovery plan. For example, having the data systems restored has a dependency on the restoration of power. For less critical data systems, there may be an option to have spare server hardware, and if required these servers could be configured with the required application. This committee should have representation from all the different company agencies with a role in the disaster recovery process, typically management, finance, IT (multiple technology leads), electrical department, security department, human resources, vendor management, and so on.


During a disaster, this committee ensures that there is proper coordination between different agencies and that the recovery processes are executed successfully and in proper sequence.
Execution Phase: In this phase, the actual procedures to recover each of the disaster-affected entities are executed. A hurricane affecting a specific geographic area and a virus outbreak expected on a certain date are examples of disasters with advance notice. At the end of this phase, recovery staff will be ready to execute contingency actions to restore system functions on a temporary basis. For an enterprise, a disaster means abrupt disruption of all or part of its business operations, which may directly result in revenue loss. Effects of disasters range from small interruptions to total business shutdown for days or months, or even fatal damage to the business. The disaster recovery system cannot replace the normal working system forever; it only supports it for a short period of time. Finally, ongoing procedures for testing and improving the effectiveness of the disaster recovery system are part of a good disaster recovery plan. The fourth section explains what information the disaster recovery plan should contain and how to maintain the plan.
For example, spilling several gallons of toxic liquid across an assembly line area during working hours is a different situation than the same spill at night or during the weekend. To mitigate the risk of disruption of business operations, a recovery solution should involve disaster recovery facilities in a location away from the affected area.
Entities with a lower downtime tolerance limit should be assigned higher priorities for recovery. It should have trusted information sources in the different agencies to forestall false alarms or overreactions to hoaxes. To minimize disaster losses, it is very important to have a good disaster recovery plan for every business subsystem and operation within an enterprise.
At the earliest possible time, the disaster recovery process must be decommissioned and the business should return to normalcy. Nowadays, most meteorological threats can be forecast; hence the chances of mitigating the effects of some natural disasters are considerable.
After the disaster is detected, a notification should be sent to the damage assessment team so that they can assess the actual damage that has occurred and implement subsequent actions. Nevertheless, it is important to document the scope of these natural risks in as much detail as possible.


