Bob Violino
Contributing writer

How to create an effective incident response plan

Feature
25 Feb 2025, 11 mins
IT Leadership, Incident Response, Security

To ensure minimal business disruption, CISOs must have the right incident recovery strategies, roles, and processes in place. Security experts share tips on assembling your playbook.


When a company experiences a major IT systems outage — such as from a cybersecurity incident — it’s essentially out of business for however long the downtime lasts. That’s why having an effective incident response (IR) plan is vital.

It’s not just a matter of finding the source of an attack and containing it, though. Enterprises need to design for resilience to be able to continue operating even as key systems become unavailable.

What goes into an effective incident response plan? Here are some suggestions of essential components.

Perform impact analysis to ensure business resiliency and continuity

When a security breach brings down key systems, companies need to have a solid IT resiliency or business continuity (BC) plan in place. Even a few hours of downtime can lead to big financial losses and negative publicity.

“One of the key components of the development of a business continuity plan is to understand the essential functions your organization performs, and what the impacts would be if they were disrupted,” says Justin Kates, senior business continuity advisor for convenience store operator Wawa, who is responsible for architecting a new BC program for Wawa’s expanding footprint of more than 1,000 stores across 10 states.

“This is typically done through what is called a business impact analysis (BIA),” Kates says. “There are some in the business continuity space that think that the BIA is not a helpful tool, but in reality it helps the business continuity lead get a better understanding of how processes work across the organization.”

The BIA catalogs each process and determines what the impacts would be at certain intervals based on lengths of a business outage.

“I’ve seen a lot of success in using the BIA to determine which response plans are necessary to guide teams with workarounds if their typical applications and technology services are not working,” Kates says. Workarounds can include manual steps to perform the process or the use of alternative vendors or services to meet minimum requirements, he says. 

The time to determine which parts of the business are most essential to operations is not after an incident has happened, but well before.

“I find that the foundation of any effective incident response plan is to truly understand your business, from people to process to operations, through detailed and pragmatic impact analysis,” says Adam Ennamli, chief risk, compliance, and security officer at General Bank of Canada.

“When you talk about BIA and RTOs [recovery time objective], you shouldn’t be just checking boxes,” Ennamli says. “You’re creating a map that shows you, and your decision-makers, exactly where to focus efforts when things go wrong. Basically, the nervous system of your business.”

Many organizations treat all their systems as equally critical in practice, Ennamli says. “And when the rubber hits the road during an actual incident, precious time is wasted on less important assets while critical business functions remain offline and not bringing in revenue,” he says.
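The BIA-driven prioritization described in this section can be illustrated with a minimal Python sketch. It is not a method described by Kates or Ennamli; the process names, RTO values, dollar impacts, and the `BusinessProcess` and `recovery_order` names are all hypothetical, chosen only to show how cataloging impact at outage intervals yields a recovery order.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class BusinessProcess:
    """One row of a hypothetical BIA catalog."""
    name: str
    rto_hours: float                   # recovery time objective
    impact_by_hours: Dict[int, float]  # outage length (hours) -> estimated loss
    workaround: Optional[str] = None   # manual step or alternative vendor, if any

    def impact_at(self, outage_hours: float) -> float:
        """Estimated loss at the largest cataloged interval already reached."""
        reached = [h for h in self.impact_by_hours if h <= outage_hours]
        return self.impact_by_hours[max(reached)] if reached else 0.0


def recovery_order(processes: List[BusinessProcess],
                   outage_hours: float) -> List[BusinessProcess]:
    """Shortest RTO first; ties broken by the larger impact at this outage length."""
    return sorted(processes,
                  key=lambda p: (p.rto_hours, -p.impact_at(outage_hours)))


# Illustrative values only (assumptions, not data from the article).
catalog = [
    BusinessProcess("payroll", 72, {24: 0.0, 72: 50_000.0}),
    BusinessProcess("point-of-sale", 1,
                    {1: 20_000.0, 4: 100_000.0, 24: 750_000.0},
                    workaround="manual cash-only receipts"),
    BusinessProcess("email", 8, {4: 1_000.0, 24: 10_000.0}),
]

order = [p.name for p in recovery_order(catalog, outage_hours=4)]
print(order)
```

Sorting on RTO rather than treating every system as equally critical is the point of the sketch: the catalog, not the moment of crisis, decides what gets restored first.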

Establish a comprehensive post-incident communications strategy

Another key element that can make or break an incident response strategy is communications. Without clear communications among the major stakeholders of the business, a company might experience much longer downtimes or the loss of vital processes for extended periods.

“How are you going to go about communicating? With whom? When?” Ennamli asks. “And it’s not just about having a phone tree or a list of email addresses. You want pre-approved content blocks and templates for different scenarios, [multiple] backup communication channels, and clear decision and delegation structures for who can say what to whom.”

When an incident occurs, “the last thing you want is to be wordsmithing press releases, [sending] mass emails, or trying to figure out how to reach your team because your primary channels are down,” Ennamli says.

It’s vital to have robust communication protocols, says Jason Wingate, CEO at Emerald Ocean, a provider of brand development services. “You’re going to want a clear chain of command and communication,” he says. “Without established protocols, you’re about as effective as trying to coordinate a fire response with smoke signals.”

The severity of the incident should inform the communications strategy, says David Taylor, a managing director at global consulting firm Protiviti. While cybersecurity team members actively responding to an incident will be in close contact and collaborating during an event, he says, others are likely not as plugged in or consistently informed.

“Based on the assigned severity, stemming from the initial triage or a change to the level of severity based on new information during the response, governance should dictate the type, audience, and cadence of communications,” Taylor says.

This allows cybersecurity and other leaders to leverage a consistent timeframe from which to expect updates, Taylor says. “In concert, this enables the technical response teams to focus on the response without stopping progress to provide updates in an ad-hoc manner,” he says.
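One way to picture severity-driven governance is a small lookup table that maps each severity level to a type, audience, and cadence of communications. The Python sketch below is an assumption-laden illustration, not Protiviti's framework: the severity levels, audiences, cadences, and the `COMMS_MATRIX` and `next_update_due` names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import IntEnum
from typing import Tuple


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass(frozen=True)
class CommsRule:
    update_type: str
    audience: Tuple[str, ...]
    cadence_minutes: int


# Hypothetical governance table: severity -> type, audience, cadence.
COMMS_MATRIX = {
    Severity.LOW: CommsRule("ticket note", ("response team",), 480),
    Severity.MEDIUM: CommsRule("email summary", ("response team", "IT leadership"), 240),
    Severity.HIGH: CommsRule("status briefing",
                             ("IT leadership", "crisis management team"), 60),
    Severity.CRITICAL: CommsRule("executive briefing",
                                 ("executives", "legal", "corporate communications"), 30),
}


def next_update_due(severity: Severity, last_update: datetime) -> datetime:
    """When the next update is owed, even if there is nothing new to report."""
    return last_update + timedelta(minutes=COMMS_MATRIX[severity].cadence_minutes)


last = datetime(2025, 1, 1, 9, 0)
due_high = next_update_due(Severity.HIGH, last)
# If new information raises the severity, the cadence tightens with it.
due_critical = next_update_due(Severity.CRITICAL, last)
print(due_high, due_critical)
```

Because the schedule is derived from severity alone, stakeholders know when to expect the next update and responders are not interrupted for ad-hoc status requests.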

One of the most important steps is appointing a communications lead as part of the incident management structure, Kates says. “When technology systems are unavailable, many within the organization will need to implement workarounds to keep essential processes going,” he says. “Many of the decisions they make are based on updates that are being provided on the status of the incident and expected resolution times.”

The technology teams will be focused on mitigating the impacts of the incident and might not have the time to provide updates, Kates says. “Your plans should outline who will take the lead in sharing updates with internal and external stakeholders, including even updating them when there may not be any new information,” he says.

Structure teams with clearly defined response roles and workflows

It’s important to understand who’s responsible for what following an incident.

“When a cyber incident hits, confusion is your biggest enemy,” Wingate says. “A team without defined roles is going to be running around like an orchestra without a conductor. They all may be technically skilled, but they’re all playing different songs. When incidents occur, confusion costs time, and when an incident does occur, time is everything.”

Structure and roles should go beyond the cybersecurity or IT staffs. “The biggest myth in cybersecurity is that it’s just an IT problem,” Wingate says. “Modern cyber incidents are business incidents, and treating them otherwise is like having a fire escape plan that only one person knows about in the building.”

The IR structure and roles ideally should include representatives from across the enterprise.

The key cybersecurity roles are the CIO/CTO, CISO, incident commander, incident coordinator, endpoint analyst, network analyst, and external forensics support, among others, Taylor says. Roles outside of cybersecurity should include the crisis management team and possibly representatives from legal, corporate communications, human resources, finance, and others, depending on the extent of the incident.

“Defining who sits in each of these roles is key, with associated responsibilities that should also be clearly defined and easily referenced in relevant plans,” Taylor says.

It’s also important for IR plans to identify key external stakeholders, says Rocco Grillo, a managing director at business advisory firm Alvarez & Marsal Disputes and Investigations and head of the firm’s global cyber risk and incident response services practice.

This includes outside counsel, IR and forensics investigation firms, cyber insurance contacts, notification and credit monitoring firms, law enforcement, and ransomware negotiation firms, Grillo says.

Understand the totality of your threat landscape

The cybersecurity threat landscape is broad and complex, and effective IR strategies need to be designed to address this complexity. Attacks can come from a growing number of sources and affect not just an enterprise, but its suppliers and other business partners as well.

“More focus is being concentrated on supply chain attacks as opposed to direct hits on companies,” Grillo says.

Supply chain attacks are “akin to a burglar breaking into a building’s superintendent office to get the keys to allow entrance into all of the apartments in the building, versus a burglar breaking into only the penthouse of the building to take the crown jewels,” Grillo says.

In addition, IR plans need to focus not just on external threats but also on insider threats.

“Insider threat risks are not only limited to malicious employees, but also employees who commit acts of human error and/or unknowingly create cyber risk exposures to their companies that threat actors are able to exploit,” Grillo says.

Third-party vendors and suppliers fall into the insider threat category, Grillo says.

“Third parties can have authorized access to a company, and when [they are] compromised by a threat actor, they inadvertently create a ‘draw bridge’ for threat actors into the companies that the third party is contracted with,” he says.

Conduct continuous testing and regular reviews

Enterprises need to test their incident response and business continuity plans when they first create them and then on a regular basis, to ensure they are effective.

“This really shouldn’t have to be said; it’s like everything else in tech, test it first,” Emerald Ocean’s Wingate says. “If you jump out of a plane, you’d probably want to make sure your parachute was checked first. You don’t want to find out it doesn’t work as you’re hurtling out of a plane.”

One of the reasons why regular testing is so important is that the cybersecurity landscape is constantly shifting.

“In my experience, the key to effective recovery is treating your incident response plans as living, mental playbooks rather than static documents, and regularly stress testing your assumptions,” General Bank of Canada’s Ennamli says. “The pivot is moving beyond theoretical planning to practical, tested steps that have been proven to work under pressure.”

Following any security incident, enterprise IR and BC teams need to conduct reviews to see how well plans were executed and where improvements can be made.

“Recovery from an incident [and] exercises of the incident response program must be followed by a disciplined lessons-learned effort,” Protiviti’s Taylor says. “These are commonly referred to as after-action reviews [AARs], post-incident reviews [PIRs], hotwashes, or debriefs. Regardless of label, a disciplined and documented approach of managing both positives and [negatives] post-incident is paramount to continuous improvement.”

Stress simplicity and modularity wherever possible

Although the threat landscape is complex, IR and BC strategies don’t need to be. Sometimes, simpler is better.

“We typically see organizations craft numerous, hundred-page binders for their emergency plans, one for incident response, another for business continuity, another for disaster recovery, etc.,” Wawa’s Kates says. “Most of these plans have significant overlap and are just copied templates they have found online.”

Instead of creating separate, cumbersome plans for each type of incident, Kates has adopted a modular, “playbook” approach.

“You can develop a few hazard-specific playbooks (ransomware, power outage, severe weather) that can plug and play common functions of incident response [such as] communication, situation assessment, business process workarounds,” Kates says.

This approach allows teams to activate and combine relevant plays based on an incident’s nature, creating a more useful plan, Kates says.

“I’ve found it’s also far simpler than maintaining multiple large plans, ensuring information remains current,” he says. “Playbooks include checklists and decision trees to guide responders through complex procedures, reducing cognitive overload during a crisis.”
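The modular approach can be sketched in a few lines of Python. This is only an illustration of the idea, not Wawa's actual playbooks: the step wording, the `COMMON_PLAYS` and `PLAYBOOKS` tables, and the `build_checklist` function are hypothetical. Hazard-specific playbooks reference shared plays, and activating several playbooks at once produces one checklist with each shared play included a single time.

```python
from typing import Dict, List

# Shared plays, written once and reused by every playbook (hypothetical steps).
COMMON_PLAYS: Dict[str, List[str]] = {
    "situation_assessment": [
        "Confirm which systems are affected",
        "Assign an initial severity",
    ],
    "communication": [
        "Notify the communications lead",
        "Send the first stakeholder update",
    ],
    "business_process_workarounds": [
        "Activate documented manual workarounds",
    ],
}

# Hazard-specific playbooks only list which shared plays they plug in.
PLAYBOOKS: Dict[str, List[str]] = {
    "ransomware": ["situation_assessment", "communication",
                   "business_process_workarounds"],
    "power_outage": ["situation_assessment", "business_process_workarounds"],
    "severe_weather": ["situation_assessment", "communication"],
}


def build_checklist(hazards: List[str]) -> List[str]:
    """Combine the plays for the active hazards, keeping each play once."""
    seen = set()
    checklist: List[str] = []
    for hazard in hazards:
        for play in PLAYBOOKS[hazard]:
            if play in seen:
                continue
            seen.add(play)
            checklist.extend(f"[{play}] {step}" for step in COMMON_PLAYS[play])
    return checklist


# A storm that also knocks out power activates two playbooks together.
checklist = build_checklist(["power_outage", "severe_weather"])
for item in checklist:
    print(item)
```

Updating a step in `COMMON_PLAYS` changes it for every playbook that uses it, which is the maintenance advantage over several overlapping hundred-page plans.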
