CrowdStrike / Windows Outage

What happened?

An update to the CrowdStrike Falcon Sensor, involving a faulty channel file, has reportedly caused some Windows machines to crash with a Blue Screen of Death (BSOD) and then enter a boot loop, preventing them from operating normally. There is no evidence that this is the result of a cyber attack or hacking event.

CrowdStrike has issued a statement here: https://www.crowdstrike.com/blog/statement-on-falcon-content-update-for-windows-hosts/

The support portal link (available for CrowdStrike customers only) is here: https://supportportal.crowdstrike.com/s/article/Tech-Alert-Windows-crashes-related-to-Falcon-Sensor-2024-07-19

The Wikipedia page about the incident and its various impacts, which links to several articles and sources, can be found here:

https://en.wikipedia.org/w/index.php?title=2024_CrowdStrike_incident

What can be done about it?

Reports indicate that the root cause has been addressed, and that systems which can be rebooted and receive the remediated content update should recover. Unfortunately, where crashes prevent impacted systems from staying online long enough to receive the new content update, there is no easy fix. At the time of writing, it is not clear what proportion of impacted systems are unable to recover by rebooting.

A workaround has been suggested by CrowdStrike in their statement:

Workaround Steps:
1. Boot Windows into Safe Mode or the Windows Recovery Environment
2. Navigate to the C:\Windows\System32\drivers\CrowdStrike directory
3. Locate the file matching “C-00000291*.sys”, and delete it.
4. Boot the host normally.

This workaround has been reported as effective by several sources; we have not independently verified it ourselves.
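
For teams that need to repeat steps 2 and 3 across many hosts, the file lookup and deletion can be sketched as a short script. This is a minimal illustration only, not a CrowdStrike-provided tool: it assumes Python is available in an environment where the affected drive is accessible, and the function names and the dry_run flag are hypothetical names chosen for this example. Only the directory and the file pattern come from the workaround steps above.

```python
from pathlib import Path

# File pattern quoted in the workaround steps.
CHANNEL_FILE_PATTERN = "C-00000291*.sys"

# Directory quoted in the workaround steps. Assumption: adjust the drive
# letter if the affected volume is mounted somewhere else.
DEFAULT_DRIVER_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")


def find_channel_files(driver_dir):
    """Return the files in driver_dir that match the channel file pattern."""
    driver_dir = Path(driver_dir)
    if not driver_dir.is_dir():
        return []
    return sorted(p for p in driver_dir.glob(CHANNEL_FILE_PATTERN) if p.is_file())


def remove_channel_files(driver_dir, dry_run=True):
    """Report matching files; delete them only when dry_run is False."""
    matches = find_channel_files(driver_dir)
    for path in matches:
        if dry_run:
            print(f"Would delete: {path}")
        else:
            path.unlink()
            print(f"Deleted: {path}")
    return matches


if __name__ == "__main__":
    # Dry run by default: nothing is deleted until dry_run=False is passed.
    remove_channel_files(DEFAULT_DRIVER_DIR, dry_run=True)
```

The dry run is the default so that the matched files can be reviewed before anything is removed. Deleting files in this directory is likely to need administrator rights.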

AWS

AWS has posted advice for impacted users here: https://repost.aws/en/knowledge-center/ec2-instance-crowdstrike-agent

Microsoft Azure

Microsoft has posted advice for Azure users on the Azure Compute Blog here: https://techcommunity.microsoft.com/t5/azure-compute-blog/recovery-options-for-azure-virtual-machines-vm-affected-by/ba-p/4196798

Google Cloud

Google has posted advice for Google Cloud users on their incident page here: https://status.cloud.google.com/incidents/DK3LfKowzJPpZq4Q9YqP

Recovery

For heavily impacted businesses, recovery may take a significant amount of time, especially if many hosts need manual intervention via the workaround above. Where many remote users are impacted, IT engineers may need to talk end users through the workaround individually. Where impacted equipment is deployed remotely, an engineer may need to visit each impacted system.

Improve your security