How to Protect Your Business From a Disaster Scenario
If you read my previous blog post, you'll understand the importance of having a Business Continuity Plan (BCP) in place to ensure that your organisation is protected in the event of a disaster. As mentioned in that post, Business Continuity and Disaster Recovery (BCDR) is a broad term used to describe an organisation's preparation for unforeseen risks to its continued operations.
In this blog post, I'll outline the solutions that commonly feature in a BCP and how they can ensure that your organisation continues to run with minimal disruption in the event of a disaster. Most of the BCDR solutions I mention in this post are IT-related and range from a simple backup through to high availability, which involves real-time data replication and minimal (if any) data loss when the plan is invoked.
In order to determine the most appropriate solution against the business impact and risk assessment, any business needs to understand the following for all services to be protected:
- Recovery Time Objective (RTO); this is the maximum amount of time that can elapse after a disaster before the business systems must be back online
- Recovery Point Objective (RPO); this is the maximum amount of data loss the business can afford, usually expressed as a period of time (for example, the last hour of changes)
These measurements will guide a business towards the disaster recovery (DR) options that are suitable for it. RTO and RPO should be considered first, before other requirements such as budget or the IT equipment already in place, because BCDR is a business requirement and should therefore be driven by the services the business uses and how critical they are. Take a company's email system as an example. Most companies have one and don't want to spend too much money on it, but from the perspective of the business, email is a critical asset. It is often the primary method of contacting customers and staff, and it contains data that can't be found on any other system in the business.
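To make the two measurements concrete, here is a minimal Python sketch that records an RTO and RPO for one service and checks an incident against them. The 4-hour RTO and 1-hour RPO for email, the incident times and the names `RecoveryTarget` and `assess` are all assumptions made up for illustration, not recommended values.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryTarget:
    """Recovery requirements for one business service (illustrative)."""
    service: str
    rto: timedelta  # maximum acceptable time offline
    rpo: timedelta  # maximum acceptable window of lost data


def assess(target, last_good_copy, outage_start, service_restored):
    """Compare what happened in an incident with the agreed targets."""
    data_lost = outage_start - last_good_copy    # changes made since the last copy
    downtime = service_restored - outage_start   # time the service was offline
    return {
        "rpo_met": data_lost <= target.rpo,
        "rto_met": downtime <= target.rto,
        "data_lost": data_lost,
        "downtime": downtime,
    }


# Assumed targets and incident times, for illustration only
email = RecoveryTarget("email", rto=timedelta(hours=4), rpo=timedelta(hours=1))
result = assess(
    email,
    last_good_copy=datetime(2024, 1, 10, 2, 0),      # nightly backup at 02:00
    outage_start=datetime(2024, 1, 10, 11, 30),      # failure at 11:30
    service_restored=datetime(2024, 1, 10, 14, 0),   # back online at 14:00
)
print(result["rpo_met"], result["rto_met"])  # prints: False True
```

In this made-up incident the service is back within its RTO, but nine and a half hours of email are lost, so a nightly backup alone would not satisfy a 1-hour RPO.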
Once the business has identified all its systems and confirmed the recovery requirements (i.e. RTO and RPO) for each, DR solutions and options can be investigated and confirmed. As with many things, there is no one-size-fits-all fix: the eventual DR solution will depend on the system, what the business requires, the current IT in place, budgets, dependencies and other factors. Below is a list of some of the most common IT DR solutions:
A common element of a BCP is ensuring that a suitable, equipped workspace is available to staff if a key location becomes unavailable. A number of businesses provide recovery workspace facilities for use in the event of a site disaster, normally for a per-desk annual payment. These workspaces are fitted out with the usual conveniences, including desks, chairs, telephones, computers, photocopiers, printers and network connectivity, along with support for setting it all up to your company's requirements.
Backup is the most common type of DR solution and is typically provided across all systems and services, especially those that store data. Some people may not regard backups as DR because they only address the RPO: without a system to restore to, a backup alone cannot guarantee the RTO. However, backup is an integral part of any BCP.
There are a great many backup options, probably as many as for all the other DR solutions put together. Nevertheless, all backups share the same main requirement: the protection of data at a point in time. The backup could be stored on the same system, on another local system, remotely at another site, or a combination of these. It could be written to different storage media, taken at various times and kept for different lengths of time (retention).
Common backup solutions include tape media, snapshotting, cloud backups and version control, so the business and IT requirements need to be understood before an appropriate backup solution is designed.
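Retention is easier to picture with an example. The Python sketch below applies one simple, assumed rule: keep every backup from the last 7 days, and only the Sunday backups from the 3 weeks before that. The rule, the function name `backups_to_keep` and the dates are illustrative assumptions, not a recommended policy.

```python
from datetime import date, timedelta


def backups_to_keep(backup_dates, today, daily_days=7, weekly_weeks=4):
    """Return the backup dates kept by a simple daily + weekly retention rule."""
    keep = set()
    for d in backup_dates:
        age = (today - d).days
        if 0 <= age < daily_days:
            keep.add(d)   # recent: keep every daily backup
        elif daily_days <= age < weekly_weeks * 7 and d.weekday() == 6:
            keep.add(d)   # older: keep Sunday backups only (weekday 6 is Sunday)
    return sorted(keep)


today = date(2024, 1, 31)
nightly = [today - timedelta(days=i) for i in range(30)]  # 30 nightly backups
kept = backups_to_keep(nightly, today)
print(len(nightly), "backups taken,", len(kept), "kept")  # 30 backups taken, 10 kept
```

Longer retention gives more points in time to restore from, at the cost of more storage, which is why the retention period should come from the business requirement rather than from the default setting of a backup product.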
Cold standby is a BCDR solution that goes hand-in-hand with a backup solution. It provides a bare-metal server or blank system onto which a company's data can be restored. Cold standby solutions can be provided in a number of ways: shipping hardware to site, attending a data centre that provides temporarily allocated hardware, or bringing up new systems in a cloud provider's estate.
Cold standby solutions normally have the longest RTO because a business has to create and configure all of its systems before any data can be restored to them. This solution is normally selected because of budget constraints or because the services in question are less critical to the business. It is the cheapest of the DR solutions.
Businesses should carefully consider the use of cold standby before selecting it. Although it may seem the cheapest option, or sometimes even free, it can mean that a business is unable to access its data or services for hours or even days. In a disaster, this can have dramatic effects on how the business operates and how it is perceived by its customers.
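A rough calculation shows why cold standby recovery can run into days. The Python sketch below adds up the time to obtain hardware, the time to build the system and the time to restore the data. Every figure in it is an assumption for illustration; real values depend on your supplier, systems and data volumes.

```python
def estimated_recovery_hours(provision_hours, configure_hours, data_gb, restore_gb_per_hour):
    """Rough cold standby recovery time: get hardware, build it, then restore data."""
    restore_hours = data_gb / restore_gb_per_hour
    return provision_hours + configure_hours + restore_hours


# Assumed figures, for illustration only
rto_hours = 8
estimate = estimated_recovery_hours(
    provision_hours=24,       # hardware shipped to site the next day
    configure_hours=6,        # install and configure operating system and applications
    data_gb=2000,             # size of the backup to restore
    restore_gb_per_hour=200,  # sustained restore throughput
)
print(f"Estimated recovery: {estimate:.0f} h against an RTO of {rto_hours} h")
print("Meets RTO" if estimate <= rto_hours else "Misses RTO")
```

With these assumed numbers the estimate is 40 hours, five times the 8-hour RTO, and most of that time is spent before any data restore has even started.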
Continuous Replication is very much the next step on from a standard backup. It consists of continually, or frequently, sending changes from a source system to a remote system, where they are applied immediately so that both systems contain the same data. This means that in the event of a disaster a company loses little or no data.
Continuous Replication can be combined with backups to provide point-in-time sets by pausing and ‘snapshotting’ the data, but its main use is to keep the RPO as low as possible. It can also reduce the RTO, but recovery can take just as long as from a backup because only the company data is replicated, not the system or its configuration. In that case, a cold standby solution would be used to bring the system back up.
Continuous Replication suits customers who wish to protect their most critical data but do not need immediate access to it after a disaster. This can be useful for data that is written often but viewed rarely, e.g. CCTV video footage or audit logs.
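The link between replication and RPO can be shown with a toy model. In the Python sketch below, every write lands on the primary and is queued for the remote copy; whatever is still in the queue when disaster strikes is the data that would be lost. The class `ReplicatedStore` and the CCTV keys are hypothetical and stand in for whatever replication product is actually used.

```python
from collections import deque


class ReplicatedStore:
    """Toy model of a primary system with an asynchronous remote replica."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = deque()  # changes not yet sent to the remote site

    def write(self, key, value):
        self.primary[key] = value
        self.pending.append((key, value))

    def ship_changes(self, max_changes=None):
        """Send pending changes to the replica, oldest first."""
        shipped = 0
        while self.pending and (max_changes is None or shipped < max_changes):
            key, value = self.pending.popleft()
            self.replica[key] = value
            shipped += 1
        return shipped

    def changes_at_risk(self):
        """Changes that would be lost if the primary failed right now."""
        return len(self.pending)


store = ReplicatedStore()
for minute in range(3):
    store.write(f"cctv-clip-{minute}", f"footage for minute {minute}")
store.ship_changes(max_changes=2)   # the link only kept up with two of the three
print(store.changes_at_risk())      # prints: 1
```

The more often changes are shipped, the smaller the queue and the lower the achievable RPO. Note that the sketch copies data only, which matches the point above that the system and its configuration still have to be rebuilt elsewhere.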
High Availability is the next step beyond Continuous Replication and is sometimes described using terms like warm or hot standby. High Availability (HA) not only keeps the RPO to a minimum but also aims to reduce the RTO. It does this by replicating not only the data but normally the entire system, including configuration data such as users, authentication, operating systems and other data not created by the business.
HA is typically the most expensive form of DR a business will take on because the secondary equipment is already running, or ready to run, when a disaster scenario is invoked. HA has to be carefully planned so that other systems and users can work with the HA system without affecting the business, and so that services can be restored to the primary system when it becomes available again.
HA is used for the most critical services within a business: those that would cause the business major disruption if they failed.
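One common way to decide when the secondary system should take over is a heartbeat check. The Python sketch below is a simplified illustration of that idea, not a description of any particular HA product: the class names, the site names and the threshold of three missed checks are all assumptions.

```python
class Node:
    """One system in an HA pair (illustrative)."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy


class FailoverPair:
    """Switch to the standby after several consecutive failed health checks."""

    def __init__(self, primary, standby, max_missed=3):
        self.active = primary
        self.standby = standby
        self.max_missed = max_missed
        self.missed = 0

    def heartbeat(self):
        """Run one health check and return the name of the active node."""
        if self.active.healthy:
            self.missed = 0
        else:
            self.missed += 1
            if self.missed >= self.max_missed and self.standby.healthy:
                # Standby takes over; the failed node becomes the new standby
                self.active, self.standby = self.standby, self.active
                self.missed = 0
        return self.active.name


pair = FailoverPair(Node("site-a"), Node("site-b"))
pair.active.healthy = False                      # simulate a failure at site-a
print([pair.heartbeat() for _ in range(3)])      # ['site-a', 'site-a', 'site-b']
```

Waiting for several missed checks avoids failing over on a brief network blip. The sketch deliberately leaves out the move back to the primary once it has recovered, which is one of the parts of HA that needs the most planning.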
Fault Tolerance is very similar to HA, but here two or more systems work in parallel, performing exactly the same actions. Fault tolerance allows a service to continue without disruption even if a system fails. This is not normally considered DR and is used only in rare circumstances where a service is key to a business, such as web or financial transactions. Beyond this, organisations that require services and data to be available all the time, such as banks and operators of payment card systems, develop specialised software and systems so that data is written to more than one system at different sites before being committed.
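The idea of writing to more than one site before committing can be sketched in a few lines of Python. This is a deliberately simplified model with hypothetical names (`Site`, `committed_write`, the site names and the transaction keys); real payment systems use far more sophisticated protocols.

```python
class Site:
    """One data centre holding a copy of the data (illustrative)."""

    def __init__(self, name, available=True):
        self.name = name
        self.available = available
        self.data = {}


def committed_write(sites, key, value, required=2):
    """Commit a write only if enough sites can store it; otherwise store nothing."""
    reachable = [site for site in sites if site.available]
    if len(reachable) < required:
        return False  # refuse rather than leave the data at a single site
    for site in reachable:
        site.data[key] = value
    return True


london = Site("london")
leeds = Site("leeds")

print(committed_write([london, leeds], "txn-1001", 250))  # prints: True
leeds.available = False                                   # one site goes offline
print(committed_write([london, leeds], "txn-1002", 99))   # prints: False
```

The trade-off is that the service rejects new writes when too few sites are reachable, in exchange for never reporting a transaction as committed while only one copy of it exists.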
For more information on BCDR and the solutions recommended for your BCP, visit our webpage. You can also contact us on 01992 807 44 or email@example.com.