Disaster Recovery (DR) and Business Continuity Planning (BCP) have become a regulatory requirement in the Oil & Gas, Banking & Finance, and Energy sectors, and they are good business practice in any sector. I prefer to call it a DR and BC framework. Since the term DR & BCP is widely accepted, I will use it loosely and interchangeably with DR & BC framework in these thoughts.
What is not DR & BCP?
DR & BCP is not a plan for restoring a business operation when a disk array fails in a data center or a server fails to reboot. That is an outage, and the incident management process should be followed, not a disaster recovery plan.
What is DR & BCP?
DR & BCP is a framework for the speedy restoration of business operations, with minimal impact, after an unforeseen disaster.
What is the context for DR & BCP?
Priorities in life change quickly when circumstances change drastically. We can see this by looking back at the priorities we had 5, 10, 15, and 20 years ago. Priorities change even when circumstances change at their own pace; they change far faster when circumstances change drastically.
When drastic changes followed the natural disasters in Haiti, New Zealand, and recently Japan, observe how people's priorities changed during and immediately after the event. DR & BCP is a framework to follow to restore a business operation immediately after that kind of disaster, whether it is global or local to the business operation.
What are the components of the DR & BC framework?
- Business architecture: Blueprint illustrating the functional aspects of the core business
- Technical architecture: Blueprint of how technology is used to service the core business
- Building architecture: The locations of business operation, the interconnections between them, and a complete blueprint of each location, including electrical circuits, backup generators, drainage, sewage, etc.
- Risk management plan: Risk assessment, risk rating, approach
- Location: Location of the current operation
- Various other factors
- Restoration plan/procedure: Scenario planning for restoring the business: sequence of operations, mechanics, process, etc.
- People: Chosen team members who will run the restoration plan/procedure
DR & BCP usually puts most of its emphasis on the restoration plan/procedure component and little or none on the other components laid out above. For the banking industry, the Federal Financial Institutions Examination Council publishes a complete IT booklet on BCP, and there are lots of great ideas in that template/booklet. However, when I pictured restoring the business operation during or after a horrific event, my thought process kept arriving at the components listed above as the ones required. The booklet helps to achieve certification from the examiner, but I feel team formation and location are also key pieces of the framework, and they are missing from the template. In general, DR & BCP is already taken to mean the restoration plan/procedure, which is all about the techniques and mechanics of restoring business operations. The rest of the components are self-explanatory and business dependent. I will expand on the importance of people and location below.
People: When a disaster strikes key locations of the business operation, the operations team needs a plan to recover quickly, mentally, emotionally, and physically, and then to work on restoring the business operation. A DR & BC plan is a framework that helps restore the business operation when such a circumstance is presented to us as a surprise, by mother nature or by any other means. DR & BCP requires more emphasis on identifying team members who have the mental toughness to stay calm in a very difficult situation and direct the rest of the team in restoring the business operation.
We will all remember Mr. Chesley Sullenberger forever for his heroic landing on the Hudson River and how he saved the lives of all 155 people on board. We were all honored to have Mr. Sullenberger at the 2009 Super Bowl opening ceremony. In the recent New Zealand earthquake, the Christchurch mayor, Mr. Bob Parker, was another example of an excellent crisis manager. Mr. J. Radhakrishna, who worked on the recovery after the 2004 tsunami, was internationally recognized for his recovery measures; former President Clinton personally met him and asked him to share his expertise in Washington.
With or without a DR & BCP, these heroes stayed calm and directed everyone else toward a speedy restoration with minimum impact. Having a plan but no one with a personality like theirs to execute it will diminish the success rate of the DR & BCP.
Who is the Mr. Sullenberger, Mr. Parker, or Mr. J. Radhakrishna in your DR & BCP?
Location: Key business operations are expected to be located where the probability of natural disaster is low. Location selection is not part of executing the DR & BCP, but it is part of its risk assessment. The probability of disaster at key locations is expected to be very low; if the risk analysis reveals that it is higher, an immediate business cost-benefit analysis on lowering the risk is required. In most cases, the locations of key business operations such as data centers and power generation units were selected 40, 60, or 80 years ago, without much thought given to the choice. Generally the business case for moving to a safer location cannot be justified, so a second location is used to offset the risk. However, there are scenarios where both locations, or all of several locations, have the same or a higher probability of disaster. In some scenarios, all of the locations may be hit by the disaster in the same time frame. Those are high-risk areas for the business operation. In the financial sector these losses have an economic impact, whereas in the energy and oil & gas sectors the impacts are both economic and life-threatening. Guidelines of this kind must be mandated as part of the regulatory requirements for key sectors like Oil & Gas and Energy.
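To see why a second location does not always offset the risk, here is a minimal Python sketch of the arithmetic. It assumes a deliberately simple model that is mine, not part of any regulatory guidance: a shared regional event (for example, one earthquake reaching every site) takes out all sites at once, and otherwise each site fails independently. The function name and every probability below are hypothetical and chosen only for illustration.

```python
def prob_all_sites_down(site_probs, shared_event_prob=0.0):
    """Probability that every site is hit within the same period.

    site_probs: per-site probability of a local disaster (hypothetical inputs).
    shared_event_prob: probability of one regional event that hits all sites
    together (an assumption of this simplified model).
    """
    independent = 1.0
    for p in site_probs:
        independent *= p  # all sites fail independently of each other
    # Either the shared event happens, or it does not and all sites fail anyway.
    return shared_event_prob + (1.0 - shared_event_prob) * independent


if __name__ == "__main__":
    # Two sites in unrelated regions, each with a hypothetical 2% yearly risk.
    print(prob_all_sites_down([0.02, 0.02]))
    # The same two sites when a single regional event (1% yearly) can reach both.
    print(prob_all_sites_down([0.02, 0.02], shared_event_prob=0.01))
```

Under these made-up numbers, the shared regional event dominates the result: two sites in the same hazard zone are only slightly safer than the regional event itself, which is the high-risk scenario described above.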
Ideally, the probability of disaster at the location of a key business operation must be a factor when evaluating a service provider or hosting provider for a key or mission-critical operation. The location determines the probability of a natural disaster. Earthquakes are among the natural disasters with the most significant impact, along with tsunamis (a side effect of an earthquake that happens under the ocean), hurricanes, storms, heavy rainfall, etc. The recent earthquake in New Zealand and the most recent one in Japan show that the "Ring of Fire" has a very high probability of massive earthquakes. That is a key indicator that key business operations like data centers should not be in this region. If a business operation such as a data center is already in this region, consider, as stated above, a business cost-benefit analysis of migrating to a safer location. My recommendation is to hire a geologist for a week to pick an ideal location with low probability. Other areas have a high probability of earthquakes that are not massive, and even a mild to medium earthquake at a data center will have a major business impact. See the chart for the US. All the red spots have a very high probability of a massive earthquake, whereas the green and yellow shaded areas also have a high probability of a mild to medium earthquake. All the white spots on the map have a very low probability of even a very mild earthquake. Similar charts are available for other natural disasters such as hurricanes and floods. Overlay those maps and you will see very few places like Michigan, which is one of the safest places in the US to host data centers. Michigan also has a big pool of highly talented labor to manage business operations like data centers, compared with other safe places in the US.
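The overlay of hazard maps described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the four-level rating scale, the scoring rule (worst hazard first, then total across hazards), and the site names and ratings are all hypothetical placeholders, not readings from any real hazard map.

```python
# Hypothetical rating scale, loosely mirroring the colors on a hazard chart.
HAZARD_LEVELS = {"very low": 0, "moderate": 1, "high": 2, "very high": 3}


def overlay(hazard_maps, location):
    """Combine all hazard maps for one location into (worst level, total level)."""
    levels = [HAZARD_LEVELS[ratings[location]] for ratings in hazard_maps.values()]
    return max(levels), sum(levels)


def rank_locations(hazard_maps, locations):
    """Order candidate locations from safest to riskiest under the overlay."""
    return sorted(locations, key=lambda loc: overlay(hazard_maps, loc))


if __name__ == "__main__":
    # Placeholder data: one rating per hazard map per candidate site.
    maps = {
        "earthquake": {"site_a": "very high", "site_b": "moderate", "site_c": "very low"},
        "hurricane": {"site_a": "very low", "site_b": "high", "site_c": "very low"},
        "flood": {"site_a": "moderate", "site_b": "moderate", "site_c": "moderate"},
    }
    for site in rank_locations(maps, ["site_a", "site_b", "site_c"]):
        print(site, overlay(maps, site))
```

Ranking by the worst single hazard first reflects the point that one severe hazard is enough to rule a location out, however safe it looks on the other maps. A real assessment would rely on a geologist's judgment rather than a scoring rule like this one.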
Note: Thank you to Babu Jon for providing constructive feedback to improve the quality of the document.