On July 19, 2024, a faulty update from CrowdStrike, a US-based cybersecurity provider of Endpoint Detection and Response (EDR) services, triggered a worldwide IT outage. The update caused approximately 8.5 million Windows devices (Microsoft's estimate) to crash with a Blue Screen of Death (BSOD) and led to an estimated $5.4 billion in losses for Fortune 500 companies, according to cyber insurer Parametrix. The failure occurred because the kernel driver of CrowdStrike's EDR agent encountered an error. Affected systems then entered a reboot loop, alternating between BSODs and restarts, which made it impossible to downgrade or upgrade the agent.
Most EDR solutions require a kernel driver to monitor all events on a device. While kernel drivers provide the necessary visibility for detecting malicious activities and responding more effectively, they also pose a significant risk of causing BSODs if an error occurs.
Many people are wondering how such a large-scale failure could have happened. Various causes have been cited, but the following two are believed to be the most significant:
- Frequently updated detection logic located in the kernel driver.
- Simultaneous deployment of detection logic updates without a gradual rollout.
In both of these respects, Genians' EDR solution, Genian EDR, is designed to be more reliable and secure.
4 Key Reasons Why Genian EDR is Safe
- Agent Architecture
Unlike traditional EDR solutions that rely heavily on kernel-level operations, Genian EDR offloads critical detection logic to the application layer, significantly reducing the risk of system crashes (BSODs).
- Gradual Deployment for Verification
Urgent updates, like antivirus pattern updates, often skip phased rollout, leading to large-scale disruptions. Genian EDR employs a 3-tier structure:
This phased deployment structure allows thorough testing and limits the impact of potential issues before broader deployment.
- Robust Safety Features
Genian EDR provides built-in error-avoidance mechanisms:
- An "infinite reboot prevention function" that stops agent operation after repeated BSODs.
- Automatic rollback or temporary suspension after repeated crashes.
- Automatic agent restart if a deadlock is suspected.
- Avoidance of simultaneous process/file occupation to prevent conflicts.
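To make the idea of an "infinite reboot prevention function" concrete, the following is a minimal sketch of how such a guard could work in principle. It is not Genian EDR's implementation: the `BootGuard` class, the `max_failures` threshold of 3, and the `previous_shutdown_clean` flag are all hypothetical names and values chosen for illustration. A real agent would also need to persist the counter across reboots (for example on disk), which this in-memory sketch omits.

```python
from dataclasses import dataclass


@dataclass
class BootGuard:
    """Hypothetical guard that disables the agent after repeated unclean boots."""

    max_failures: int = 3          # assumed threshold, not taken from Genian EDR
    consecutive_failures: int = 0  # a real agent would persist this across reboots
    agent_enabled: bool = True

    def on_boot(self, previous_shutdown_clean: bool) -> bool:
        """Record how the previous session ended; return True if the agent may start."""
        if previous_shutdown_clean:
            # A clean session resets the failure streak.
            self.consecutive_failures = 0
        else:
            # The previous session ended in a crash (for example, a BSOD).
            self.consecutive_failures += 1

        if self.consecutive_failures >= self.max_failures:
            # Too many crashes in a row: stop loading the agent so the
            # machine can boot and be repaired or rolled back.
            self.agent_enabled = False

        # Once disabled, the agent stays off until an operator re-enables it.
        return self.agent_enabled


guard = BootGuard()
history = [True, False, False, False, True]  # simulated shutdown outcomes
decisions = [guard.on_boot(clean) for clean in history]
print(decisions)
```

In this sketch the agent stays disabled even after a later clean boot, on the assumption that a deliberate re-enable (or a rollback to a known-good version) is safer than resuming automatically.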
- Rigorous Quality Assurance
Another common criticism raised by experts who analyzed the cause of this incident is the "absence of a QA process." Because security software must respond to malicious behavior, it is almost impossible to catch every error through manual testing alone. In addition to manual testing by developers, Genian EDR uses an automated QA system to detect basic and critical errors before a new version is released. Furthermore, changes to Genian EDR functions are managed internally in three stages (BETA, RC, and RELEASE), and a verification and testing period of approximately two months is required before a modified function is delivered to customers.
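The staged BETA, RC, and RELEASE process can be pictured as a simple promotion gate. The sketch below is illustrative only and does not describe Genians' internal tooling: the `Build` and `promote` names, the version string, and the split of the roughly two-month verification period into 30 days per stage are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import IntEnum


class Stage(IntEnum):
    BETA = 1
    RC = 2
    RELEASE = 3


# Assumed minimum days a build must spend in each stage before promotion.
# The source states only "approximately two months" in total.
MIN_DAYS = {Stage.BETA: 30, Stage.RC: 30}


@dataclass
class Build:
    version: str
    stage: Stage = Stage.BETA
    days_in_stage: int = 0


def promote(build: Build, automated_qa_passed: bool, manual_qa_passed: bool) -> Build:
    """Return the build at its next stage, or raise if any gate is not satisfied."""
    if build.stage == Stage.RELEASE:
        raise ValueError("build is already released")
    if not (automated_qa_passed and manual_qa_passed):
        raise ValueError("both automated and manual QA must pass")
    if build.days_in_stage < MIN_DAYS[build.stage]:
        raise ValueError("minimum verification period has not elapsed")
    # Stages are strictly ordered, so a build can never skip from BETA to RELEASE.
    return Build(build.version, Stage(build.stage + 1), days_in_stage=0)


beta = Build("1.0.0", days_in_stage=30)   # hypothetical version number
rc = promote(beta, automated_qa_passed=True, manual_qa_passed=True)
print(rc.stage.name)
```

The point of the gate is that no single check is sufficient: a build must pass automated QA, pass manual QA, and wait out the verification period at each stage before it moves closer to customers.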
Genian EDR Key Benefits
Genian EDR reduces the risk of catastrophic system failures by prioritizing system stability in both design and operation. Its key benefits are:
- Catastrophic Failure Mitigation: Reduces the risk of system-wide disruptions associated with kernel-level EDR solutions.
- Application-Level Detection: Offloads critical detection logic to the application layer, minimizing the chance of BSODs caused by kernel driver instability.
- Robust Update Process: Implements staged deployments and rigorous QA for secure and reliable updates.
- Enhanced Security and Reliability: Offers a more secure and dependable EDR solution overall.
To learn more about the solution, contact edr-biz@genians.com.