Responding to Fundamental Flaws in Ineffective DR Measures
An exhaustive risk identification strategy that guarantees business recovery during a system outage or malfunction
Inability to recover from a business disruption implies an impact on revenue. System inoperability and lack of resources for hours or days on end can have serious consequences for a company’s financials and brand perception. Analyzing the foundational flaws in recovery measures is essential to understanding why they fail.
Root causes for business recovery failures can be resolved through decisive solutions that facilitate operational resiliency
Consolidated Strategy | Change Management | Validation |
BCDR solutions must take a holistic view of the threat landscape. Vulnerabilities must be covered exhaustively and in detail. | BCDR solutions, business processes and recovery measures must be synchronized. | Testing clarifies whether a particular strategy works. Exercise drills evaluate whether it is feasible. Any recovery measure must meet both requirements to be effective. |
Emergency teams might encounter some resistance when initiating an open discussion across all departments on past failed recovery measures. However, the activity is crucial for identifying inherent flaws in the recovery framework. When system inoperability and unavailability extend beyond the recovery point and recovery time objectives (RPO and RTO respectively), IT teams must undertake an in-depth analysis of business systems to identify foundational flaws that render recovery measures ineffective.
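As a rough illustration of when such an analysis is triggered, the sketch below checks a single incident against its objectives. It assumes the usual reading of the two terms: the RTO caps how long a system may stay down, and the RPO caps how much recent data may be lost, measured here from the last good backup. The function name, parameters and timestamps are hypothetical and not part of any particular BCDR product.

```python
from datetime import datetime, timedelta

def breached_objectives(outage_start, service_restored, last_good_backup, rto, rpo):
    """Return the recovery objectives that an incident exceeded."""
    downtime = service_restored - outage_start          # compared against the RTO
    data_loss_window = outage_start - last_good_backup  # compared against the RPO
    breaches = []
    if downtime > rto:
        breaches.append("RTO")
    if data_loss_window > rpo:
        breaches.append("RPO")
    return breaches

# Hypothetical incident: 6.5 hours of downtime, 30 minutes of lost data.
incident = breached_objectives(
    outage_start=datetime(2024, 3, 1, 9, 0),
    service_restored=datetime(2024, 3, 1, 15, 30),
    last_good_backup=datetime(2024, 3, 1, 8, 30),
    rto=timedelta(hours=4),
    rpo=timedelta(hours=1),
)
print(incident)  # ['RTO']
```

Any non-empty result would be a reasonable cue to start the in-depth analysis described above.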
Strategy Consolidation
While planning is an integral component of any Business Continuity and Disaster Recovery solution, paying attention to the last detail can ensure that recovery measures enjoy a higher success rate.
Organizations often overlook BCDR solutions for a variety of reasons such as:
- There is no pressing need to formulate a BCDR solution
- They can make do with insurance coverage, government assistance and their own in-house skills
- Time Constraints and Complexity
- Lack of Organizational Policy that reinforces the need for effective recovery measures
- Budgetary Considerations
BCDR Solutions must be comprehensive and customizable. This requires:
- Frequent vulnerability and risk assessments to evaluate the potential impact on personnel, assets and operations
- Measures to enhance vendor resiliency including their ability to cater to the organization’s business needs during an emergency
- Crisis communication policies that provide a two-way interactive module for information sharing, incident escalation and decision making
BCDR solutions must be exhaustive.
What needs to be recovered?
Electricity Supply, Software, Hardware, Storage devices, Connectivity options, Servers and procedures
How are they to be recovered?
Options include
- In-house solutions, which can be constrained by limited expertise and domain knowledge
- External entities that can provide specialized recovery solutions
- Informal agreements with other businesses for sharing resources and leveraging a consolidated recovery effort
When are they to be recovered?
Understanding system interdependencies is essential for building a coherent recovery solution. IT teams must also sequence system restorations precisely, based on each system's recovery time and recovery point objectives.
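One way to picture this sequencing is as a dependency-ordered schedule. The sketch below is only an illustration under stated assumptions: the system names, RTO values and dependencies are invented, and the tie-breaking rule (tightest RTO first among systems that become ready together) is one possible policy rather than a prescribed one. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical systems: name -> (RTO in hours, systems it depends on)
systems = {
    "network":  (1, []),
    "storage":  (2, ["network"]),
    "database": (4, ["storage"]),
    "auth":     (2, ["network"]),
    "web_app":  (6, ["database", "auth"]),
}

def restoration_order(systems):
    """Order restorations so that dependencies always come first.

    Systems that become ready in the same batch are sorted by RTO
    (tightest first), then by name to keep the result deterministic.
    """
    sorter = TopologicalSorter({name: deps for name, (_, deps) in systems.items()})
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    order = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready(), key=lambda name: (systems[name][0], name))
        for name in ready:
            order.append(name)
            sorter.done(name)
    return order

print(restoration_order(systems))
# ['network', 'auth', 'storage', 'database', 'web_app']
```

A circular dependency surfaces as an error at planning time, which is usually preferable to discovering it during an actual recovery.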
Who will execute the recovery measure?
Despite the growing presence of machine learning and automation capabilities in present-day business systems, human intervention is still an integral component of responding to a business disruption and can’t be overlooked. Generic considerations include:
- Impact on resource availability
- Transport arrangements for commuting to alternate locations and data centers
- Facilitating virtual workplaces
Organizations store their recovery related information in multiple locations such as:
- Configuration Management Data Base (CMDB)
- Ticketing System
- Shared file space
IT teams often encounter friction while trying to consolidate all the data found in multiple locations in line with the enterprise’s BCDR policies.
While backup copies are important, organizations must also facilitate a standardized and consolidated version across all record instances that can be easily accessed.
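A consolidated version could be assembled along the following lines. This is a minimal sketch with invented records and field names; the rule that the most recently updated record wins a conflict is an assumption chosen for illustration, and an organization's BCDR policy may call for a different one.

```python
from datetime import date

# Hypothetical records about the same system held in three places.
cmdb = {"erp-db": {"owner": "DBA team", "updated": date(2024, 1, 10)}}
ticketing = {"erp-db": {"owner": "Platform team", "updated": date(2024, 4, 2)}}
shared_files = {"erp-db": {"runbook": "restore-erp.docx", "updated": date(2023, 11, 5)}}

def consolidate(*sources):
    """Merge per-system records from several sources.

    When a field appears in more than one source, the value from the
    most recently updated record is kept (an assumed policy).
    """
    grouped = {}
    for source in sources:
        for system, record in source.items():
            grouped.setdefault(system, []).append(record)
    result = {}
    for system, records in grouped.items():
        combined = {}
        # Apply oldest first so that newer records overwrite older fields.
        for record in sorted(records, key=lambda r: r["updated"]):
            combined.update(record)
        result[system] = combined
    return result

master = consolidate(cmdb, ticketing, shared_files)
print(master["erp-db"]["owner"])    # Platform team
print(master["erp-db"]["runbook"])  # restore-erp.docx
```

Fields that exist in only one source, such as the runbook reference here, survive the merge, so the consolidated record is at least as complete as any single copy.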
Change Management
Besides inadequate preparation, many recovery solutions fail to achieve their objectives due to lack of regular maintenance. In a constantly evolving business landscape, enterprises must keep their recovery plans current through regular updates.
BCDR plan maintenance must be carried out along with configuration and change management drills. For instance, various components in the production environment, including hardware equipment and software applications, are constantly being updated. This includes adding and removing units as well as modifying settings.
Consider the following scenario:
Production Environment
- Servers – 60
- OS Units – 60
- Security Software Agents – 60
- Management Agents – 60
- Infrastructure Software Modules – 60
- Application Software Modules – 60
Recovery Facility
Servers – 30
- OS Units – 30
- Security Software Agents – 30
- Management Agents – 30
- Infrastructure Software Modules – 30
- Application Software Modules – 30
VMs – 60
- OS Units – 60
- Security Software Agents – 60
- Management Agents – 60
- Infrastructure Software Modules – 60
- Application Software Modules – 60
Applying updates across all these units, even once a month, would imply more than 10,000 modifications annually (900 units × 12 months). This can be hard to manage for understaffed IT teams. Failing to synchronize the production environment with the recovery configuration can lead to severe vulnerabilities in the enterprise’s BCDR capabilities.
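The estimate can be reproduced from the scenario figures. The short calculation below assumes that every line in the scenario (the host itself plus the five software layers listed under it) counts as one unit, and that each unit is updated exactly once a month.

```python
CATEGORIES_PER_HOST = 6  # the host itself plus the five layers listed for it

hosts = {
    "production servers": 60,
    "recovery servers": 30,
    "recovery VMs": 60,
}

units = sum(count * CATEGORIES_PER_HOST for count in hosts.values())
updates_per_year = units * 12  # one update per unit per month

print(units, updates_per_year)  # 900 10800
```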
Apart from the production environment, there are other segments of business that need to be frequently updated such as:
- Business Processes
- Employee Personnel Management
- Third Party Contractors
Validation
Recovery plans need to be tested regularly so that:
- Business Continuity and Disaster Recovery capabilities are adequately mapped to the organization’s operations
- Reference documents are up to date with the most recent changes
- Plans can be deployed quickly and effectively
Process validation must go beyond conventional areas of inspection such as systems, applications and networks. Evaluating BCDR capabilities must also emphasize effective communication channels for sharing information during crisis situations, and employee expertise levels for executing tasks.
Some of the main process validation focus areas include:
- Alert notification systems
- Work from home options
- Alternate location facilities
- Shifting workloads to alternate locations
- Shifting workloads within main locations
- Provisioning Laptops
- Workaround Procedures
Testing methodologies include:
- Procedural Walk-through
- Tabletop Exercises
- Simulation Drills
- Full Scale Plan Evaluation
Recommendations for Ensuring Effective Recovery
Visibility from a centralized dashboard that gives a consolidated view of:
- Officially announced crisis situations, both current and resolved business disruptions
- Warning notifications that have been emitted to alert recovery teams and their corresponding status
- Incidents that have been archived along with their status and categorization. For instance:
- Earthquake
- Tremors
- Tsunami
- Volcanoes
- Eruption
- Landmass Movement
- Falling Rocks
- Avalanche
- Snow
- Debris
- Landslide
- Mud
- Lahar
- Debris
- Subsidence
- Abrupt
- Long Term
Recovery Management
- Emergency teams should be able to monitor tasks and activities through the entire life cycle of their execution – initiation, development and conclusion
- The BCDR solution must facilitate a consolidated snapshot of all the activities linked to a specific business disruption
- Tasks and activities can then be broken down based on status such as:
- Yet to Commence
- Work in Progress
- Completed
- Recovery
- Teams
- The status on various tasks should be updated based on the progress made. Dependencies between different tasks should also be taken into consideration. For instance, if there are two tasks, A and B, such that task B sequentially follows task A, the BCDR solution should not allow employee personnel to initiate task B until and unless task A has been completed
- Business disruptions are erratic by nature, and the manner in which they evolve cannot be accurately forecast. Hence, emergency teams must be able to add impromptu tasks to the default list of activities. Default attributes of any task should include:
- Unique identifying name with description
- Department, such as accounts, IT Admin and so on
- Timeframe for completion
- Human resource in charge
- Alternate employee option
- Notes
- IT teams should also be able to decide the categorization of the unplanned task that is being created, such as
- Primary, Secondary or Tertiary Task
- Dependencies with other tasks and so on
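The dependency rule and the impromptu-task requirement described above can be sketched as follows. The class names, fields and status strings are hypothetical and only mirror the attributes listed in this section; this is not the interface of any specific BCDR platform.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    name: str
    department: str
    owner: str      # human resource in charge
    alternate: str  # alternate employee option
    depends_on: list = field(default_factory=list)
    status: str = "Yet to Commence"

class RecoveryPlan:
    def __init__(self):
        self.tasks = {}

    def add_task(self, task):
        """Register a task; impromptu tasks are added the same way."""
        self.tasks[task.name] = task

    def start(self, name):
        """Start a task only if every task it depends on is completed."""
        task = self.tasks[name]
        blocked_by = [dep for dep in task.depends_on
                      if self.tasks[dep].status != "Completed"]
        if blocked_by:
            raise RuntimeError(f"{name} is blocked by {blocked_by}")
        task.status = "Work in Progress"

    def complete(self, name):
        self.tasks[name].status = "Completed"

plan = RecoveryPlan()
plan.add_task(Task("A", "IT Admin", "primary.owner", "backup.owner"))
plan.add_task(Task("B", "IT Admin", "primary.owner", "backup.owner", depends_on=["A"]))

try:
    plan.start("B")  # refused: task A has not been completed yet
except RuntimeError as err:
    print(err)

plan.start("A")
plan.complete("A")
plan.start("B")      # allowed now that task A is completed
print(plan.tasks["B"].status)
```

Raising an error keeps the rule strict; a dashboard might instead disable the start action for blocked tasks and show what they are waiting on.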