CISOs should look to proactively incorporate new lessons in their incident response, disaster recovery, crisis communications, and contingency workforce playbooks, and revisit agreements with software providers.

At 4 a.m. UTC on July 19, cybersecurity giant CrowdStrike sent out what it thought was a routine content configuration update to its Falcon product, which analyzes internet connections for malicious behavior. A worldwide IT outage of unprecedented severity and scale then unfolded, which will be the subject of intense scrutiny and potential legal action for years to come.

In a cascading series of events, a vast swath of the global community, including airlines, hospitals, banks, healthcare facilities, governments, and more, was brought to a standstill. This crisis is still unfolding as system administrators and other tech workers continue to restore their organizations’ operations.

But even at this early stage, CISOs can take advantage of the situation to enhance their organizations’ resiliency and reliability in the long run.

“What we know is CrowdStrike kind of messed up on three things,” David Brumley, CEO of ForAllSecure, tells CSO. “CISOs should be looking at those three things and at their own services and software that they publish to make sure they get these three things right.

“The first thing is CrowdStrike had software vulnerabilities inside the driver code. The second thing is they didn’t do sufficient testing of exactly what customers are going to see. The third thing is they didn’t do a staggered rollout. They updated everyone at once.”

According to CrowdStrike’s preliminary post-incident review, the update was a content configuration for the Windows sensor used to gather telemetry on possible novel threat techniques. Due to a bug in the content validator, the problematic content passed validation despite containing an error.
Consequently, the update resulted in an out-of-bounds memory read, disrupting at least 8.5 million Windows machines, according to Microsoft’s latest estimate.

In its review, CrowdStrike promised to improve its testing program by introducing content update and rollback testing. It also pledged to implement staggered deployment for content, in which updates are deployed gradually to progressively larger portions of its base, starting with a canary deployment to a small group of users.

CISOs can view CrowdStrike’s errors as object lessons that can help them better deal with incidents in the future. They can also take advantage of the crisis to adopt other beneficial strategies, such as updating incident response playbooks, updating disaster recovery plans, revising software provider contracts, and preparing for better communications and contingency workforces before the next major crisis hits.

Treat the incident as a learning experience

Although the CrowdStrike outage was not a cybersecurity incident, CISOs should view it as comparable to one. Jamie Boote, principal consultant for Synopsys Software Integrity Group, tells CSO, “I would encourage firms to look back, not just at this instance but also at other security outages, ransomware, recovery, all of that. Treat it as an opportunity to practice and say, ‘How would we perform if this big issue came and be able to deal with it in the future?’ That way, you’re not figuring it out as you go along.”

Moreover, experts say this kind of software error will almost certainly occur again. “We should expect it to happen again, and you need to protect against it,” Ranjan Singh, chief product officer at Kaseya, tells CSO. “There are humans involved in the entire chain of development, so invariably, there’s always room for error.
But it’s our job to make sure that we go to the ends of the earth and figure out how to prevent something like this, especially in critical products.”

ForAllSecure’s Brumley says this kind of incident will “absolutely” happen again. “Huge” industry consolidation with fewer and fewer vendors will mean that “more and more people will be affected when the next big software error occurs,” he says. Security workforces that are stretched thin will only worsen the industry’s ability to respond next time. “I think people are getting tired of security, and especially with the markets changing, there’s been a huge security workforce reduction,” he says.

Time to revisit disaster recovery plans

One risk management component that CISOs should revisit now is disaster recovery. “I think a lot of companies probably got to run their disaster recovery process during the CrowdStrike outage, but not willingly, not voluntarily,” Christine Gadsby, CISO of BlackBerry, tells CSO.

She recommends CISOs pay particular attention to a vendor backup system. “What if you’re dependent on only that one vendor, and they go down? Well, then, you’ve got a dependency on that one vendor that just crashed society. So, have a vendor backup system in your disaster recovery plan and understand those vendors’ risk assessments.”

“Every organization has to have a disaster recovery plan of some sort,” Kaseya’s Singh says. “This incident took out lots of end users. It also took out critical business systems. You must assume things will go wrong, and that’s what disaster recovery scenarios back up, the ability to spin up your critical business systems.”

Trust but verify software providers

The wave of outages involved two of the most highly trusted software companies, CrowdStrike and Microsoft, meaning CISOs can’t automatically rely on software provider reputation to avoid disasters.
Gadsby says, “Even if their names bring the utmost confidence to us, we should trust but verify because it is a human aspect, and humans make mistakes.”

She adds, “I have been surprised at the number of people I’m talking to that don’t hold their vendors accountable for SLAs [service level agreements], or they haven’t thought about that, or they don’t know what they are. But if you have vendors you know are tier-one vendors, then you need to have an SLA with them and figure out what they are going to do if they do cause an outage. Maturing the vendor risk management program is critical.”

It is also crucial to work with the board and C-suite executives to educate them on any changes needed to vendor risk management. “Every CEO and board should be asking if they have a vendor accountability process in place with SLAs for what happens in times of crisis,” she says.

In working with vendors, “You have to determine what policies you want to put in place,” Singh says. “You want your security posture to be up to date at all times. Just make sure you are responsive to any updates you’re receiving. Go through the evaluation process, the contractual process, and all the due diligence to ascertain and establish their update policy.”

Crisis communication and boots on the ground are critical

CISOs should also re-examine their crisis communications plans, particularly internal communications. “Crisis communications is essential,” Gadsby says. “There are so many companies that this happened to, and they couldn’t even talk to their employees. If you as a company do not have a crisis communication platform, it is high time to start looking for one to make sure you can reach your employees in alternate channels when stuff goes down.”

Having enough boots on the ground is also important to help with remediation.
At the start of the CrowdStrike outage, both CrowdStrike and Microsoft advised organizations to manually remove the faulty update, machine by machine, an impossible task for some organizations with tens of thousands of computers. Microsoft ultimately released a program to remove the faulty updates in an automated fashion.

“Do all the prep and think about your communication strategy, but I also think there is a pretty interesting nugget for this particular type of incident,” Brumley says. “You had to have people on the ground to help with your systems. There’s this trend of going outsourced and making everything a Google service. But it’s hard to recover from these incidents when you don’t have anyone on the ground. You almost have to keep a backup force for this, at least some minimal one, to do it.”