Disaster recovery deals with resuming IT operations after a disruption or disaster. Disaster recovery planning covers the measures needed to restore, for example, IT infrastructure or important data.
What is Disaster Recovery?
Disaster recovery is part of security planning and deals with restoring important IT services and data after a disruption. Disruptions can have many causes, such as natural disasters, hardware or structural failures, operator errors, or cyberattacks. The aim is to minimize the negative impact on a company or organization. Disaster recovery involves restoring, for example, servers, networks, telephone systems, or data storage.
The terms disaster recovery and business continuity are often used interchangeably. However, business continuity is the broader concept: it is not just about restoring IT services but about maintaining critical business operations in general.
Technical measures used for disaster recovery include redundancy, the provision of replacement hardware, and data backups. They are intended to avoid so-called single points of failure in IT, that is, components whose failure alone would bring down a service.
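One way to make the idea of a single point of failure concrete is to count how many components serve each role. The following is a minimal sketch, assuming a simple inventory that maps component names to roles; the function name, the component names, and the roles are all hypothetical and only serve as illustration.

```python
from collections import Counter


def single_points_of_failure(inventory):
    """Return the roles that are served by exactly one component.

    `inventory` maps a (hypothetical) component name to the role it fulfils.
    A role with only one component has no redundancy.
    """
    instances_per_role = Counter(inventory.values())
    return sorted(role for role, count in instances_per_role.items() if count == 1)


# Hypothetical inventory: two databases, two backup targets, one firewall.
inventory = {
    "db-primary": "database",
    "db-replica": "database",
    "fw-01": "firewall",
    "backup-nas": "backup storage",
    "backup-cloud": "backup storage",
}

print(single_points_of_failure(inventory))
```

In this example only the firewall lacks a redundant counterpart. A real assessment would also have to consider shared dependencies such as power, network links, or location, which this sketch ignores.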
The Disaster Recovery Plan (DRP)
The Disaster Recovery Plan, abbreviated as DRP, lists the measures, procedures, and specifications for reacting to a failure in order to minimize the impact on the company.
The measures summarized in this IT emergency plan are described so that those responsible can work through them step by step. The plan also includes reporting channels, escalation levels, and definitions of responsibilities in the event of a disaster.
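The elements named above (step-by-step measures, responsibilities, and escalation levels) could be captured in a small data structure. The following is a rough sketch under that assumption, not a standard DRP format; the class names, fields, contacts, and steps are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    description: str
    responsible: str
    done: bool = False


@dataclass
class RecoveryPlan:
    service: str
    escalation_contacts: list  # ordered from first to last escalation level
    steps: list = field(default_factory=list)

    def next_step(self):
        """Return the first step that is not yet done, or None if all are done."""
        for step in self.steps:
            if not step.done:
                return step
        return None

    def escalate(self, level):
        """Return the contact for an escalation level (0 = first level).

        Levels beyond the end of the list fall back to the last contact.
        Assumes at least one contact is defined.
        """
        index = min(level, len(self.escalation_contacts) - 1)
        return self.escalation_contacts[index]


# Hypothetical plan for a single service.
plan = RecoveryPlan(
    service="file server",
    escalation_contacts=["on-call admin", "IT manager", "managing director"],
    steps=[
        Step("Report the failure via the agreed channel", "on-call admin"),
        Step("Provide replacement hardware", "infrastructure team"),
        Step("Restore data from the latest backup", "backup team"),
    ],
)

plan.steps[0].done = True
print(plan.next_step().description)
print(plan.escalate(5))
```

Keeping the steps in an ordered list mirrors the requirement that the plan be worked through in sequence, and the fallback in `escalate` ensures that an escalation always reaches someone.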
The Disaster Recovery Test
A disaster recovery test verifies the effectiveness of a DRP. It ensures that, in the event of an incident, the measures and procedures in the DRP actually restore IT services.
The tests must be performed at regular intervals. Results from disaster drills are incorporated into the measures, specifications, and procedures as needed. Periodic testing keeps plans current and trains staff for emergencies and the activities to be performed.
Key metrics related to disaster recovery are Recovery Time Objective (RTO) and Recovery Point Objective (RPO). RTO specifies how long a system or service may be down: it is the maximum acceptable time between the occurrence of the failure and the recovery of the service or system. Depending on the service, this period can range from a few seconds to several days or weeks.
RPO answers the question of how much data loss is acceptable. It is expressed as the maximum amount of time that may elapse between two backups, because data written after the last backup is lost in a failure. The lower the RPO, the less data is lost in the event of a failure. As part of the DRP, the values for RPO and RTO must be defined for the various IT services.
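The two metrics can be checked against the timeline of an incident or a test drill. The following is a minimal sketch, assuming that the time of the last backup, the time of the failure, and the time of recovery are known; the function names and all timestamps and target values are hypothetical.

```python
from datetime import datetime, timedelta


def meets_rpo(last_backup, failure, rpo):
    """Data written after the last backup is lost, so the loss window
    is the time between the last backup and the failure."""
    return failure - last_backup <= rpo


def meets_rto(failure, restored, rto):
    """Downtime is the time between the failure and the recovery."""
    return restored - failure <= rto


# Hypothetical targets for one service.
rpo = timedelta(hours=4)
rto = timedelta(hours=2)

# Hypothetical timeline of an incident.
last_backup = datetime(2024, 1, 10, 6, 0)
failure = datetime(2024, 1, 10, 9, 30)
restored = datetime(2024, 1, 10, 12, 0)

print(meets_rpo(last_backup, failure, rpo))  # loss window 3 h 30 min vs. RPO 4 h
print(meets_rto(failure, restored, rto))     # downtime 2 h 30 min vs. RTO 2 h
```

In this example the RPO is met, because only 3.5 hours of data are lost, while the RTO is missed, because the service was down for 2.5 hours. A result like this from a drill would indicate that the recovery procedure, not the backup interval, needs to be improved.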
Disaster Recovery as a Service (DRaaS)
DRaaS comprises cloud services that can be used to restore IT services, IT infrastructure, or data in the event of an emergency or disruption. For this purpose, a provider makes storage space, backup services, virtual IT infrastructure, or virtual servers available as a cloud service.
These services enable emergency backups without the company having to provide additional hardware and software in a separate data center. DRaaS can be useful, for example, for small and medium-sized companies that have little expertise and few resources for their own emergency measures and infrastructure.