Recovery sites are predefined locations where an organization can relocate its critical operations in the event of a major disruption or disaster. The choice of site depends on the organization’s budget, recovery time objectives (RTOs), and recovery point objectives (RPOs):
Hot sites are fully operational data centres equipped with all necessary hardware and software, mirroring the main operational environment. They provide immediate failover capability, minimizing downtime. Hot sites are the most expensive option but offer the fastest recovery time.
Warm sites are partially equipped facilities with network connections and servers, but they typically do not host live data or up-to-date applications. In the event of a disaster, a warm site needs some time to become operational, because data and software must be loaded before it reaches full functionality. Warm sites represent a middle ground in terms of cost and speed of recovery.
Cold sites provide only the physical space for recovery operations; they do not include pre-installed hardware or software. Organizations using a cold site will need to install and configure equipment and restore data from backups, which can significantly lengthen the recovery time. Cold sites are the least expensive option but require the most effort to set up following a disaster.
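The trade-off between cost and recovery speed can be sketched as a small selection routine. This is a minimal illustration, not a sizing method: the `SiteOption` type, the `cheapest_site_meeting_rto` helper, and the recovery-hour and cost figures are all assumptions made up for the example, and real values depend on the organization and provider.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SiteOption:
    name: str
    typical_recovery_hours: float  # assumed, illustrative figure
    relative_cost: int             # 1 = cheapest, higher = more expensive


# Illustrative numbers only; they are not taken from the course text.
SITE_OPTIONS = (
    SiteOption("cold", typical_recovery_hours=168.0, relative_cost=1),
    SiteOption("warm", typical_recovery_hours=24.0, relative_cost=2),
    SiteOption("hot", typical_recovery_hours=1.0, relative_cost=3),
)


def cheapest_site_meeting_rto(
    rto_hours: float, options: Sequence[SiteOption] = SITE_OPTIONS
) -> Optional[SiteOption]:
    """Return the cheapest option whose recovery time fits within the RTO."""
    candidates = [o for o in options if o.typical_recovery_hours <= rto_hours]
    if not candidates:
        return None  # no listed site type can meet this RTO
    return min(candidates, key=lambda o: o.relative_cost)


if __name__ == "__main__":
    for rto in (0.5, 4, 48, 200):
        choice = cheapest_site_meeting_rto(rto)
        print(rto, "->", choice.name if choice else "no option meets this RTO")
```

A tight RTO pushes the choice toward a hot site, while a relaxed RTO lets the cheaper cold site qualify. RPO would add a similar constraint on how current the data at the site must be.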
Building resilience into infrastructure is about ensuring that IT systems, networks, and services can withstand and recover from failures or disruptions:
Implementing redundancy involves duplicating critical components or systems, such as servers, networks, and data storage, so that a backup is available if one of them fails. This can include multiple power sources, network paths, and data centres.
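The benefit of duplication can be estimated with a short calculation. The sketch below assumes that the redundant copies fail independently and that any single working copy is enough to keep the service up. Both are simplifying assumptions, and the function name `combined_availability` is hypothetical.

```python
def combined_availability(component_availability: float, copies: int) -> float:
    """Availability of `copies` redundant components in parallel.

    Assumes independent failures: the group is down only when every
    copy is down at the same time.
    """
    if not 0.0 <= component_availability <= 1.0:
        raise ValueError("component_availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1.0 - (1.0 - component_availability) ** copies


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "copies:", round(combined_availability(0.99, n), 6))
```

Under these assumptions, two copies of a component that is available 99% of the time give roughly 99.99% combined availability. Shared dependencies such as a common power feed break the independence assumption, which is why redundancy is usually extended to power sources and network paths as well.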
Failover refers to the automatic switching to a redundant or standby system upon the failure or abnormal termination of the previously active application, server, or system. This process helps maintain service continuity without human intervention.
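As a rough sketch of the idea, the routine below tries a primary system first and switches to a standby automatically when the primary fails. The names `call_with_failover`, `primary`, and `standby` are hypothetical stand-ins, and production failover normally also involves health checks, timeouts, and state replication that are left out here.

```python
from typing import Callable, Sequence


def call_with_failover(endpoints: Sequence[Callable[[], str]]) -> str:
    """Call each endpoint in priority order and return the first success."""
    for endpoint in endpoints:
        try:
            return endpoint()
        except Exception:
            # This endpoint failed; fall through to the next standby.
            continue
    raise RuntimeError("all endpoints failed")


def primary() -> str:
    # Simulated outage of the active system.
    raise ConnectionError("primary is unreachable")


def standby() -> str:
    return "served by standby"


if __name__ == "__main__":
    print(call_with_failover([primary, standby]))
```

The caller never has to decide which system to use, which mirrors the point that failover maintains continuity without human intervention.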
Scalable infrastructure is designed to handle varying loads and can scale up or down as needed. Scalability ensures that the infrastructure can support business growth and adapt to changes without requiring a complete redesign.
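One simple way to picture scaling up and down is a rule that derives the number of instances from the current load. The function `desired_instances`, its parameters, and the bounds used below are assumptions for illustration and do not reflect any specific autoscaler.

```python
import math


def desired_instances(
    current_load: float,
    capacity_per_instance: float,
    min_instances: int = 1,
    max_instances: int = 10,
) -> int:
    """Instances needed to serve `current_load`, clamped to the allowed range."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(current_load / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


if __name__ == "__main__":
    for load in (0, 450, 5000):
        print(load, "requests/s ->", desired_instances(load, 100), "instances")
```

The lower bound keeps a baseline running when load is low, and the upper bound caps cost when load spikes.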
Regularly testing the resilience of infrastructure is crucial to ensure that systems will perform as expected during a disruption. This involves simulating failures and practising recovery procedures to identify and address any weaknesses.
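Simulating failures can be illustrated with a small drill that takes each replica of a service offline in turn and records whether requests are still served. The `ReplicatedService` class and `run_failure_drill` function are hypothetical, in-memory stand-ins; a real exercise would target actual systems under controlled conditions.

```python
from typing import Dict, Iterable, List


class ReplicatedService:
    """Toy service that answers from the first healthy replica."""

    def __init__(self, replica_names: Iterable[str]) -> None:
        self.healthy: Dict[str, bool] = {name: True for name in replica_names}

    def fail(self, name: str) -> None:
        self.healthy[name] = False

    def restore(self, name: str) -> None:
        self.healthy[name] = True

    def handle_request(self) -> str:
        for name, is_healthy in self.healthy.items():
            if is_healthy:
                return name
        raise RuntimeError("no healthy replicas")


def run_failure_drill(service: ReplicatedService) -> List[str]:
    """Fail each replica in turn; return those whose loss causes an outage.

    Assumes every replica is healthy when the drill starts.
    """
    weaknesses = []
    for name in list(service.healthy):
        service.fail(name)
        try:
            service.handle_request()
        except RuntimeError:
            weaknesses.append(name)  # single point of failure found
        finally:
            service.restore(name)
    return weaknesses


if __name__ == "__main__":
    print(run_failure_drill(ReplicatedService(["a", "b"])))
    print(run_failure_drill(ReplicatedService(["only"])))
```

A drill that returns any names has found a single point of failure, which is the kind of weakness that regular testing is meant to surface before a real disruption does.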

Figure-7