Before embarking on a crusade to build a framework to support IT continuity, it would be appropriate to first describe what it is and to justify the need for such a framework.
An IT continuity framework is a set of published principles and policies, developed by IT, which governs how IT continuity measures are agreed and implemented. This framework should ideally form part of a CIO's IT Governance Framework. In order for such a framework to be effective, it should be aligned to business needs and objectives. This may present the CIO with certain difficulties. Many businesses, even large ones, do not have published business strategies; even fewer have documented strategies which are up to date and agreed by all business units. Nevertheless, it is essential that a CIO ensures that the IT environment can survive events which disrupt IT operations.
In many instances IT organisations are doing sterling work in making provision for IT continuity and IT disaster recovery in organisations where there is no provision for business continuity or even much awareness of the problem. (Business continuity is the overall provision of measures aimed at ensuring that the rest of the business outside of IT can continue to function despite disruptions and disasters.) Even when business neglects its responsibilities it is still the duty of IT to ensure an appropriate disaster response capability.
To meet the organisation's requirements, a framework for IT continuity should meet the following general specifications:
* A framework for IT continuity should support enterprise-wide business continuity management with a consistent process (assuming that such an initiative exists).
* The framework should assist in determining the required resilience of the infrastructure.
* The framework should drive the development, testing and execution of disaster recovery and IT contingency plans.
* The framework should address the organisational structure for continuity management defining the roles and responsibilities of internal staff and external service providers.
* The framework should define rules and structures to document, test and execute the disaster recovery and IT contingency plans.
* The framework should also identify resources required to keep critical IT functions running or restore critical functions after a disaster event.
* The framework should define principles of back-up, recoverability, resiliency and, when needed, redundancy in IT processing and support resources.
* Most importantly the framework should be based upon business risk and resulting impact. This is crucial to the overall justifiability and effectiveness of the IT continuity effort.
* The framework should also take into account the following factors:
- Regulatory requirements (for instance Financial Services Board requirements).
- IT best practices and standards (such as CobiT, ITIL, UK Business Continuity Institute, etc).
- Legal requirements (record retention, Companies Act, etc).
- Internal and external audit demands.
This framework will define the methods of operation for an organisation's IT Continuity efforts at a high level. It should be the primary source of all policies covering IT continuity.
How to justify and direct the framework
I regularly meet IT managers and executives who wrestle with the same set of problems. These problems relate to two crucial issues: firstly, the question of justification, particularly cost justification, and secondly, the nature and extent of the provisions which should be made.
The answers lie in the commonsense discipline of risk management, and most of them come from determining the risks of business processes failing or being disrupted. The path to easy justification lies in assessing the risk and then quantifying the resultant impacts (the cost of failure) in a manner which can be presented to management. Generally, business management neither understands nor cares about the reasons behind systems failure and the time and effort required to restore IT service. The one aspect that management universally understands is the cost of, or penalty associated with, service failure and data loss. A carefully constructed case based on the cost of downtime to the business will justify expenditure on the protection of such services and data.
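The cost-of-downtime argument above can be illustrated with a little arithmetic. The following is a minimal sketch only, assuming a simple model in which the annual cost of downtime is the hourly business loss multiplied by the expected outage length and the expected number of incidents per year. The class, the function names and every figure are hypothetical and would be replaced by the results of your own risk assessment and impact analysis.

```python
from dataclasses import dataclass


@dataclass
class ServiceRisk:
    """Hypothetical inputs taken from a risk assessment and impact analysis."""
    name: str
    cost_per_hour: float        # business loss per hour of downtime
    hours_per_incident: float   # expected time to restore the service
    incidents_per_year: float   # expected frequency of disruption


def annual_downtime_cost(risk: ServiceRisk) -> float:
    """Expected yearly loss if nothing is done to protect the service."""
    return risk.cost_per_hour * risk.hours_per_incident * risk.incidents_per_year


def is_justified(risk: ServiceRisk,
                 annual_protection_cost: float,
                 residual_fraction: float) -> bool:
    """Compare the loss avoided with what the protection costs per year.

    residual_fraction is the share of the loss that is assumed to remain
    even with the protection in place (0.0 means all loss is avoided).
    """
    avoided = annual_downtime_cost(risk) * (1.0 - residual_fraction)
    return avoided > annual_protection_cost


# Illustrative figures only, not taken from any real organisation.
order_entry = ServiceRisk("order entry", cost_per_hour=20000.0,
                          hours_per_incident=8.0, incidents_per_year=0.5)
print(annual_downtime_cost(order_entry))
print(is_justified(order_entry, annual_protection_cost=30000.0,
                   residual_fraction=0.25))
```

A figure produced this way is only as good as the estimates behind it, but it expresses the case in the terms management understands: money lost against money spent.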
Where to put our IT service continuity efforts
It is not practical to simply assume that all IT functions should have the same service continuity efforts. It would, in any case, result in an overkill solution. Efforts should be prioritised so that the continuity and recovery efforts meet business and risk criteria.
It is a sound principle of business continuity management (BCM) to differentiate systems and processes during recovery. Some systems will require highly available solutions which can continue processing despite hardware, telecommunications or other failures. Some systems will be able to tolerate short periods of downtime. Some systems will permit long periods of downtime, but may reach thresholds beyond which downtime is no longer tolerable, or may be required to function at certain critical times (payroll could be a good example).
All of these factors are determined during risk assessment and impact analysis. They should always form the basis of your response capability.
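The differentiation described above can be sketched as a simple classification. This is an illustration under stated assumptions rather than a prescribed method: the three tier names, the one-hour and 24-hour limits and the example systems are all hypothetical, and real thresholds would come from your own risk assessment and impact analysis.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    HIGH_AVAILABILITY = "must keep processing through failures"
    SHORT_OUTAGE = "tolerates short periods of downtime"
    LONG_OUTAGE = "tolerates long periods of downtime"


@dataclass
class System:
    name: str
    tolerance_hours: float                            # normal tolerable downtime
    critical_tolerance_hours: Optional[float] = None  # e.g. during a payroll run


def classify(system: System,
             in_critical_period: bool = False,
             ha_limit_hours: float = 1.0,
             short_limit_hours: float = 24.0) -> Tier:
    """Assign a recovery tier from the tolerable downtime (assumed limits)."""
    tolerance = system.tolerance_hours
    # A tighter tolerance applies while the system is in a critical period.
    if in_critical_period and system.critical_tolerance_hours is not None:
        tolerance = system.critical_tolerance_hours
    if tolerance <= ha_limit_hours:
        return Tier.HIGH_AVAILABILITY
    if tolerance <= short_limit_hours:
        return Tier.SHORT_OUTAGE
    return Tier.LONG_OUTAGE


# Hypothetical systems for illustration.
trading = System("trading", tolerance_hours=0.25)
payroll = System("payroll", tolerance_hours=72.0, critical_tolerance_hours=4.0)

print(classify(trading).name)
print(classify(payroll).name)
print(classify(payroll, in_critical_period=True).name)
```

The point of the sketch is that the tier follows from the business tolerance for downtime, not from the technology the system happens to run on.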
A general rule is also that recovery solutions are not as costly as highly available solutions. Make your responses to the business needs cost-appropriate, as defined by risk and impact. Try to avoid a complex set of solutions based upon technical considerations. Make all your efforts match the business need to avoid or deal with disaster.
No, I am sorry to say, the provision of continuous IT service is a complex subject and even more difficult to implement properly. The construction of a framework is merely part of the process. Nonetheless, if it is carefully constructed, diligently maintained and consistently used to define and direct efforts, it will ensure that IT and the enterprise have a good basis for agreement and understanding.