Part One: Foundation of Resilience: Why Leadership Must Drive Disaster Preparedness
Over the last couple of years I’ve talked and written about the importance of leadership, specifically non-technical stakeholders, having a central role in developing a data strategy. Too often, organizations leave their data strategy in the hands of technical teams, leading to robust and comprehensive data solutions that do not meet the needs of the stakeholders. Well, there is another area of technology where non-technical leaders and stakeholders need to be just as central to the decision-making process: disaster recovery and business continuity.
Disaster recovery and business continuity seem like old-school topics that were solved ages ago, yet sadly, the number of organizations without a reliable plan is higher than most of us would want to admit. Here are some statistics that should give any executive the chills:
- According to a 2021 study by Computing Research, only 54% of organizations have documented disaster recovery plans in place.
- Of the organizations that do have a disaster recovery plan, a shocking 7% never actually test or validate their plans. Additionally, half of them only test their plans once a year or less frequently.
- A study by IBM found that 77% of organizations don’t have a cyber security incident response plan that applies to the entire organization.
- Research by Gartner revealed that only 35% of small and midsize businesses have a comprehensive business continuity plan in place.
Often, when an executive from the C-suite is asked if their organization has a DR/BC plan, the response is “Yes.” But when you ask them about their Recovery Time Objective (RTO) or Recovery Point Objective (RPO), they have no idea.
I cannot stress enough how important it is for leadership, specifically the C-suite, to actively engage in knowing and understanding their organization’s disaster recovery and business continuity plans, and critically, their RPOs and RTOs.
C-suite executives, regardless of their technical expertise, must understand their organization’s RTO and RPO to make informed strategic decisions, mitigate risks, ensure compliance, and collaborate effectively with IT teams. By actively engaging with these critical metrics, executives demonstrate leadership and commitment to safeguarding their organization’s operations, reputation, and financial stability in the face of potential disasters.
With some context around the problem, let’s do a quick review of the concepts and terminology.
Quick Review
Now, it is important to note that disaster recovery and business continuity are distinct concepts, but they are closely intertwined. Business continuity encompasses a broader scope, including strategies to keep operations running during and after a disruptive event, while disaster recovery focuses on restoring IT systems and data.
In the context of DR, the Recovery Point Objective (RPO) defines the maximum amount of data loss an organization can tolerate in the event of a disruptive incident or disaster. It is usually expressed as a window of time: an RPO of one hour means losing at most the last hour of data. The Recovery Time Objective (RTO) is the maximum acceptable amount of time for restoring a network, application, or system and regaining access to data after an unplanned disruption or disaster event.
So what is the role of the C-suite executive in DR/BC planning?
It is the responsibility of the C-suite to provide clear guidance and direction to the technical teams tasked with implementing the DR strategy. This guidance should focus on three critical areas. First, executive leadership must identify and prioritize the business-critical systems that require zero or near-zero downtime. These are the systems that are essential to the organization’s ability to function and generate revenue. Second, leadership must define the acceptable amount of data loss during an outage, known as the Recovery Point Objective (RPO). This metric helps technical teams determine the frequency and scope of data backups. Finally, executive leadership must determine the acceptable amount of downtime before restoring critical systems, known as the Recovery Time Objective (RTO). This metric guides the development of strategies to quickly restore systems and minimize the impact of an outage. By providing clear parameters around these three key areas, executive leadership empowers technical teams to prioritize systems and develop appropriate DR strategies that align with the organization’s overall business objectives.
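To make the link between these objectives and technical decisions concrete, here is a minimal sketch in Python. It assumes the simplest possible model: backups run on a fixed interval, so in the worst case a failure just before the next backup loses a full interval of data. The names (SystemObjectives, meets_rpo, meets_rto) and the payroll figures are hypothetical and chosen only for illustration; they do not come from any real DR tool.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class SystemObjectives:
    """Leadership-defined targets for one business-critical system (hypothetical)."""
    name: str
    rpo: timedelta  # maximum tolerable data loss, expressed as a window of time
    rto: timedelta  # maximum tolerable downtime


def meets_rpo(objectives: SystemObjectives, backup_interval: timedelta) -> bool:
    # Assumption: a failure right before the next backup loses one full
    # interval of data, so the interval must not exceed the RPO.
    return backup_interval <= objectives.rpo


def meets_rto(objectives: SystemObjectives, measured_restore_time: timedelta) -> bool:
    # The restore time should come from an actual recovery test, not an estimate.
    return measured_restore_time <= objectives.rto


# Illustrative numbers only.
payroll = SystemObjectives("payroll", rpo=timedelta(minutes=15), rto=timedelta(hours=4))

print(meets_rpo(payroll, timedelta(hours=24)))  # nightly backups -> False
print(meets_rto(payroll, timedelta(hours=2)))   # two-hour tested restore -> True
```

In this sketch a nightly backup fails a 15-minute RPO even though the restore itself is fast enough, which is exactly the kind of gap that stays hidden when leadership has never stated the targets.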
Now, equipped with the engagement and guidance of the leadership team around the priority of critical systems, RPOs, and RTOs, the technical team can develop a robust DR strategy that will inform the BC strategy, and an effective case can be made for the value of the cloud.
In part two I will discuss the value proposition of a cloud-based DR solution using services from Amazon Web Services (AWS), and share some ideas around chaos testing for continuous improvement, along with continuous monitoring and testing.
From The News
Kronos Private Cloud Outage (2021)
In December 2021, workforce management company Kronos suffered a crippling ransomware attack that took down its private cloud services used by thousands of businesses globally.
- The outage disrupted payroll, scheduling and other workforce management operations for numerous companies across multiple industries.
- Some major customers like the City of Cleveland and the University of Florida had to revert to manual timekeeping processes.
- Kronos was unable to restore services from backups for weeks due to the severity of the attack.
The incident highlighted the company’s lack of resilient DR capabilities and contingency planning for such a scenario.
Previously published on LinkedIn