The key to effective recovery is being able to meet the needs and expectations of the user community.
Anyone who has been involved in a disaster recovery after a catastrophic failure of their company's IT infrastructure will understand some of the hard facts of life.
The first fact of life is that any user, manager or executive who is unable to execute critical business functions is also likely to have a very poor sense of humour. When factories grind to a halt, when warehouses are unable to dispatch goods or when invoicing and cash collection are just not possible, all understanding of your difficult recovery task is absent.
The second fact of life is that a tangled web of servers and network connections, which normally functions pretty well but takes a long time to reconstruct, is not in the interests of rapid recovery. Stressed accountants desperate to meet their own deadlines will not easily accept delays that would normally be acceptable.
During a recovery, the astute IT professional will need to have all of his water-borne poultry carefully aligned (techno-speak for having your ducks in a row). The most effective way of doing this (sorry, there are no easy ways) is to ensure that an impact analysis has been carried out. The impact analysis will give the IT professional a firm business-based target for recovery planning.
Recovery planning can now be undertaken to meet the following needs:
Impact - plan to recover the applications, services and business processes that have the greatest impact first. Then sequence all remaining facilities in declining order of impact value.
Agree a cut-off point with management, beyond which no recovery occurs, or beyond which time criticality is not a material issue. This is very important in determining the cost and effort required for recovery.
Minimise the recovery requirement. The less you have to recover the higher your chances of success.
Minimise the recovery requirement. The less you have to recover the faster you can recover.
Minimise the recovery requirement. The less you have to recover the lower the cost of providing recovery resources. These resources include disaster recovery contracts, spare servers, standby staff and other resources.
Estimate the time required to recover facilities and build a time line. Identify those facilities that cannot be recovered in the target times needed by the business, as set out in the impact analysis mentioned earlier. These facilities should be covered by high-availability solutions, such as hot or warm standby systems. The impact analysis should also justify the costs of providing such solutions. Typically it is impossible to relocate to a new environment and rebuild servers inside 12 to 16 hours.
Simplify your production environment. Do not have two file servers when one will do. Two servers are more than twice as difficult to recover as one.
Execute tests to determine how long it actually takes to rebuild your infrastructure. Carry out these tests regularly and refine your recovery processes to meet target timeframes. Use the results to adjust your production environment. Simple changes in the way you do your backups can dramatically affect recovery time and effort.
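The sequencing and time-line steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Facility fields, the impact scores, the hour figures and the single-team, one-after-another recovery model are all hypothetical and are not taken from any particular impact analysis.

```python
from dataclasses import dataclass


@dataclass
class Facility:
    name: str
    impact: int           # hypothetical impact score from the impact analysis (higher = more critical)
    target_hours: float   # recovery target needed by the business
    rebuild_hours: float  # estimated (ideally tested) time to rebuild


def build_timeline(facilities, cutoff_impact):
    """Return (name, finish_hour, meets_target) tuples in recovery order.

    Assumption: facilities are rebuilt one after another by a single team.
    """
    # Drop anything below the cut-off point agreed with management.
    in_scope = [f for f in facilities if f.impact >= cutoff_impact]
    # Greatest impact first, then declining order of impact.
    ordered = sorted(in_scope, key=lambda f: f.impact, reverse=True)

    timeline = []
    elapsed = 0.0
    for f in ordered:
        elapsed += f.rebuild_hours
        timeline.append((f.name, elapsed, elapsed <= f.target_hours))
    return timeline


# Illustrative figures only.
facilities = [
    Facility("file server", impact=3, target_hours=48, rebuild_hours=6),
    Facility("invoicing", impact=9, target_hours=8, rebuild_hours=5),
    Facility("warehouse dispatch", impact=10, target_hours=4, rebuild_hours=6),
    Facility("intranet wiki", impact=1, target_hours=120, rebuild_hours=4),
]

timeline = build_timeline(facilities, cutoff_impact=2)

# Facilities that miss their target are candidates for hot or warm standby.
needs_standby = [name for name, _, ok in timeline if not ok]

for name, finish, ok in timeline:
    status = "meets target" if ok else "needs high availability"
    print(f"{name}: recovered at hour {finish:g} ({status})")
```

With these made-up numbers, warehouse dispatch and invoicing miss their targets and would need a standby solution, while the intranet wiki falls below the cut-off and is not recovered at all. Replacing the estimated rebuild_hours with figures from real recovery tests is what makes a time line like this worth trusting.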
Remember, however, that in the end it is more important to recover all required functions than to recover some functions quickly and fail on others. Effectiveness during recovery is a higher priority than speed and cost.
In summary, focus attention on items specified as most critical in the IT continuity plan to build in resilience and establish priorities in recovery situations. Avoid the distraction of recovering less critical items and ensure that response and recovery are in line with prioritised business needs, while keeping costs at an acceptable level and complying with regulatory and contractual requirements. Consider resilience, response and recovery requirements for different tiers, eg, one to four hours, four to 24 hours, more than 24 hours and critical business operational periods.
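As a rough illustration of the tiers mentioned above, the sketch below groups facilities by their recovery target. The function names, the sample targets and the handling of the boundaries (a target of exactly four hours falls in the first tier, and anything shorter is grouped there too) are assumptions made for the example.

```python
from collections import defaultdict


def recovery_tier(target_hours):
    """Map a recovery target in hours to one of the three example tiers."""
    if target_hours < 0:
        raise ValueError("target_hours must not be negative")
    if target_hours <= 4:
        return "one to four hours"
    if target_hours <= 24:
        return "four to 24 hours"
    return "more than 24 hours"


def group_by_tier(targets):
    """Group facility names by tier, given a {name: target_hours} mapping."""
    tiers = defaultdict(list)
    for name, hours in targets.items():
        tiers[recovery_tier(hours)].append(name)
    return dict(tiers)


# Illustrative targets only.
targets = {"warehouse dispatch": 4, "invoicing": 8, "file server": 48}
tiers = group_by_tier(targets)

for tier, names in tiers.items():
    print(f"{tier}: {', '.join(names)}")
```

Critical business operational periods, such as a month-end close, are not modelled here; in practice they would probably shift a facility into a tighter tier for the duration of the period.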