Failure in the cloud is a reality, still less than 25% of business are prepared for such outages. It is inevitable systems fail; things burn.
The idea behind Chaos Engineering is proving system can withstand failure by testing it, causing controlled harm to check and fix any potential issues.
This blog post is just a collection of links on the topic:
Blog | Docs
- Chaos Engineering: A cheat sheet
- Microsoft: Chaos engineering - Azure Architecture Center
- Microsoft: Advancing resilience through chaos engineering and fault injection
- Microsoft: Inside Azure Search: Chaos Engineering
- Microsoft: Induce controlled Chaos in Service Fabric clusters
- Microsoft: Shift Right to Test in Production
Tools
- Azure: A node SDK for building services capable of injecting chaos into PaaS offerings
- Chaos Toolkit has both an Azure driver (Azure - Chaos Toolkit) and a Service Fabric driver (Service Fabric - Chaos Toolkit).
- A tool for introducing chaos into the Azure PaaS using configurable extensions
- Chaos HTTP Proxy
- Creating Reliability Through Chaos With Azure VMs and Gremlin