BC/DR Strategies within a Cloud Environment
Traditional business continuity and disaster recovery methods require redundant hardware and a co-location or remote facility for data storage. This results in costly solutions that are cumbersome to implement and maintain. As more organizations migrate to the cloud in pursuit of benefits such as rapid elasticity, on-demand self-service, increased collaboration and ubiquity, many are also beginning to recognize opportunities within the cloud to facilitate business continuity/disaster recovery (BC/DR) efforts. Does this mean once you are in the cloud you have built-in disaster recovery capabilities? Absolutely not! One would be remiss to consider the cloud and business continuity/disaster recovery as one and the same. Organizations must still develop effective cloud-based BC/DR strategies and detailed plans to ensure business continuity and disaster recovery in the event of a disruption.
Before developing a detailed plan for a cloud-based BC/DR strategy, it is important to understand the cloud environment. Three common scenarios to consider are highlighted below:
1. On-Premises, Cloud as BC/DR:
In this instance, the enterprise infrastructure that supports everyday business functions exists on-premises (on-prem), while a cloud provider delivers alternative facilities in the event of a disaster affecting the on-prem infrastructure.
In this scenario, organizations often identify and select new cloud vendors. Thus, they must gain a firm understanding of the functional and resource capabilities necessary for an efficient recovery in the event of a disaster. After a new cloud provider has been selected, a careful review of the service level agreement (SLA) is necessary to ensure that all services, functionality and business requirements are adequately provided for, and any assumptions validated.
One must also consider the conversion of workloads on physical machines into virtual machines and how quickly resources can be made available when needed.
2. Cloud User, Primary BC/DR Cloud Provider:
In this scenario, the enterprise infrastructure is already located in the cloud. The risk being mitigated is the failure of any part of that infrastructure, such as a regional failure. The business continuity strategy focuses on service provisioning, restoration or failover to another part of the same cloud provider's infrastructure.
Although the focus in this scenario is placed heavily on the resources and capabilities of the existing cloud provider, one must still evaluate those capabilities carefully, as the BC/DR strategy may require new resources and functionality, such as load-balancing performance and bandwidth availability between the redundant facilities of the cloud provider.
3. Cloud User, Alternative BC/DR Cloud Provider:
This scenario is similar to Scenario 2, except in this instance, service restoration is provided by a separate cloud provider. Thus, the risk of complete cloud provider failure is mitigated.
If an organization selects a second cloud provider for service restoration, the speed of the move to the alternate cloud provider must be carefully evaluated. It is also likely that business users will feel the impact, as the functionality of the primary and secondary cloud providers may differ greatly. Thus, it is worthwhile to involve the business users as early as possible to assess the residual risks to their business.
In all cases and for all scenarios, the proper assessment and enumeration of risk is critical to designing an adequate BC/DR strategy and making balanced business decisions around risks.
Benefits of Cloud-Based BC/DR
An effective cloud-based BC/DR strategy enables the business enterprise to quickly back up its critical data, applications, and operating systems, based on business requirements. The cloud may be a strong BC/DR option for the following reasons:
• Virtualization provides faster provisioning and a smaller hardware footprint for the organization
• Broad network connectivity leads to faster uploads and downloads of important computing elements, which naturally translates to faster recovery times for the business
• The cloud is hardware independent and reduces hardware vendor lock-in
• Cloud-provider infrastructure is usually more resilient than the alternative. It can burst resources on demand, which makes it more flexible, scalable, and agile. This provides immediate access to all of the compute capacity, memory and processors that are needed, when they’re needed and for as long as they’re needed
• Pay-per-use can mean the total cost of cloud-based BC/DR may be significantly lower than other alternatives
A cloud-based solution may offer a cost-effective way to maintain high availability and reliability for business applications, especially if they support mobile workers, telecommuters or a field-based workforce.
Business Requirements for Cloud-Based BC/DR Strategy
When developing a cloud-based strategy for BC/DR, there are general concerns and business requirements specific to BC/DR and cloud, including legal and regulatory considerations, loss of governance, supply chain dependencies and location risks. Each of these concerns should be addressed when developing a comprehensive strategy. The following items should be included in a BC/DR plan:
• Location and contact information of the business workforce and business partners, including third-party providers
• Important assets that need to be protected and may need to be restored
• Current location of these assets
• Network connectivity between the assets and sites of their processing
• Data and functionality replication
• Failover capability
• Security and access considerations
In addition to the above factors, key pertinent questions need to be answered before an effective strategy can be developed. Some of these include:
• What is the required recovery point objective (RPO), i.e. what data loss would be tolerable?
• Is the data sufficiently valuable for a cloud-based BC/DR strategy?
• What is the required recovery time objective (RTO), i.e. what amount of time is tolerable for information to be unavailable?
• What dependencies exist with third-party providers, e.g. for data, processing or infrastructure?
• What kind of disaster scenarios and business impacts are included in the risk analysis? Does it include cloud provider failure?
Conducting an analysis of relevant risks will help in the development of proper business requirements for a cloud-based BC/DR strategy.
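The RPO and RTO questions above can be turned into a simple feasibility check. The following is a minimal sketch, not a sizing tool: the `RecoveryPlan` fields, the workload name and all of the numbers are hypothetical, and it assumes that the worst-case data loss equals one replication interval and that recovery time is provisioning time plus the time needed to transfer the data.

```python
from dataclasses import dataclass


@dataclass
class RecoveryPlan:
    """Hypothetical description of one workload's DR arrangement."""
    name: str
    rpo_minutes: float                   # tolerable data loss (business requirement)
    rto_minutes: float                   # tolerable downtime (business requirement)
    replication_interval_minutes: float  # how often data is copied to the DR site
    data_gb: float                       # data that must be restored
    bandwidth_mbps: float                # assumed usable throughput to the DR site
    provisioning_minutes: float          # assumed time to bring up DR resources


def worst_case_data_loss_minutes(plan: RecoveryPlan) -> float:
    # A failure just before the next copy loses one full interval of data.
    return plan.replication_interval_minutes


def estimated_recovery_minutes(plan: RecoveryPlan) -> float:
    # Decimal units: 1 GB = 8,000 megabits.
    transfer_seconds = plan.data_gb * 8000 / plan.bandwidth_mbps
    return plan.provisioning_minutes + transfer_seconds / 60


def check(plan: RecoveryPlan) -> dict:
    """Report whether the current arrangement meets the stated objectives."""
    return {
        "meets_rpo": worst_case_data_loss_minutes(plan) <= plan.rpo_minutes,
        "meets_rto": estimated_recovery_minutes(plan) <= plan.rto_minutes,
    }


# Illustrative values only.
plan = RecoveryPlan(
    name="orders-db",
    rpo_minutes=15,
    rto_minutes=240,
    replication_interval_minutes=60,
    data_gb=500,
    bandwidth_mbps=1000,
    provisioning_minutes=30,
)
result = check(plan)
print(result)
```

With these illustrative values the plan fits within the RTO but not the RPO, because hourly replication can lose up to 60 minutes of data against a 15-minute objective. A result like this would point toward more frequent replication rather than more bandwidth.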
Creation & Implementation of a BC/DR Plan
A fully tested BC/DR plan that is ready for a failover event should have a similar structure to the organization’s IT implementation plan. It is highly recommended that an organization adapt the plan to existing IT project planning and risk management methodologies. The plan could also be embedded in an information security strategy, which encompasses clearly defined roles, risk assessment, classification, policy, awareness and training. It also makes sense to consider BC/DR as an intrinsic part of IT service that is regularly invoked, if only for testing purposes.
Business requirements should be analyzed for completeness and consistency. This includes the identification of all dependencies for processes, applications, business partners and third-party service providers. The purpose of the analysis is to translate the BC/DR requirements into important inputs that will be used in the design, such as technical components and underlying services of in-house operating functions that may need to be replicated in the disaster recovery environment. Analysis may also identify opportunities for decoupling systems and services and identify any common failure modes. The design phase should list out technical alternatives and include procedures and workflows for these alternatives.
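One way to look for the common failure modes mentioned above is to walk a dependency inventory and list the components that more than one critical service relies on. The sketch below assumes a hand-maintained mapping; the service names and the `DEPENDENCIES` structure are hypothetical examples, not a prescribed format.

```python
from collections import defaultdict

# Hypothetical dependency inventory: service -> components it relies on.
DEPENDENCIES = {
    "web-portal": ["identity-provider", "orders-api"],
    "orders-api": ["orders-db", "payment-gateway"],
    "reporting": ["orders-db", "identity-provider"],
}


def transitive_dependencies(service, deps):
    """Return every component the service depends on, directly or indirectly."""
    seen = set()
    stack = list(deps.get(service, []))
    while stack:
        item = stack.pop()
        if item not in seen:
            seen.add(item)
            stack.extend(deps.get(item, []))
    return seen


def shared_components(services, deps):
    """Map each component used by two or more of the services to those services."""
    users = defaultdict(set)
    for service in services:
        for component in transitive_dependencies(service, deps):
            users[component].add(service)
    return {c: sorted(s) for c, s in users.items() if len(s) > 1}


shared = shared_components(["web-portal", "reporting"], DEPENDENCIES)
print(shared)
```

In this made-up inventory, `identity-provider` and `orders-db` each sit behind both services, so either one is a candidate for replication in the disaster recovery environment or for decoupling.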
Care should be taken to ensure that the required infrastructure and services are made available and that the disaster recovery platform tracks any relevant changes and functional updates that are made to the primary platform. Additionally, it is advisable to include all DR-related infrastructure and services in the regular IT services management plan.
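Keeping the disaster recovery platform in step with the primary platform can be checked by comparing inventories of the two. This is a minimal sketch that assumes each platform can be summarized as a flat dictionary of component names and versions; the keys and values shown are hypothetical.

```python
def find_drift(primary, dr):
    """Return the settings that differ between the primary and DR inventories.

    A key that is missing on one side is reported with None for that side.
    """
    drift = {}
    for key in sorted(set(primary) | set(dr)):
        p, d = primary.get(key), dr.get(key)
        if p != d:
            drift[key] = {"primary": p, "dr": d}
    return drift


# Hypothetical inventories for illustration.
primary_inventory = {"app_version": "4.2.1", "db_engine": "14.9", "tls_policy": "1.2+"}
dr_inventory = {"app_version": "4.1.0", "db_engine": "14.9"}

drift = find_drift(primary_inventory, dr_inventory)
print(drift)
```

An empty result would mean the two inventories match. Any entry is a change on the primary platform that has not yet been carried over to the DR platform, or the reverse.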
A formal acceptance test will bring the BC/DR into production mode. This requires integration with all regular IT service processes. Once in production mode, controls should be in place to ensure that it will keep working.
The BC/DR plan should be tested at planned intervals or upon significant organizational or environmental changes. An untested failover is unlikely to end well. Ideally, a full-scale exercise will perform a full switchover to the DR platform. At the same time, this test should not pose a threat to the production user population. Risks that can manifest should be simulated and responded to as realistically as possible.
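The testing rule above, at planned intervals or after a significant change, can be expressed as a small check that a scheduler or review checklist might run. This is a sketch under assumed values: the function name `dr_test_due`, the 180-day interval and the dates are illustrative, not recommendations.

```python
from datetime import date, timedelta


def dr_test_due(last_test, today, interval_days, significant_change=False):
    """Return True if a DR test is due by schedule or because of a change."""
    overdue = (today - last_test) >= timedelta(days=interval_days)
    return significant_change or overdue


# Illustrative dates and interval.
on_schedule = dr_test_due(date(2024, 1, 10), date(2024, 3, 1), interval_days=180)
after_change = dr_test_due(
    date(2024, 1, 10), date(2024, 3, 1), interval_days=180, significant_change=True
)
overdue = dr_test_due(date(2023, 6, 1), date(2024, 3, 1), interval_days=180)
print(on_schedule, after_change, overdue)
```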
Olay Ladeji, CISSP, CISM, CCSP, is a management consultant for Enaxis Consulting, LP. With more than 17 years of broad range international experience, Ladeji has a wealth of knowledge in information security, IT service management, cloud security, program/project management, enterprise risk management, IT governance, business analysis, business continuity management, and disaster recovery.