Problem Description:
Allstate, a Fortune 100 insurance company, reached out to AVM for help maturing its observability strategy and implementation. This was a key initiative in Allstate's digital transformation strategy. Allstate wanted to achieve total observability, and AVM collaborated to make it happen by reviewing existing patterns, involving all teams, and working through the launch of a new product.
Solution Highlights:
Allstate was already using Datadog for observability but realized that it was not using the tools to their full potential. The company wanted actionable dashboards that would allow its teams to quickly locate the source of issues and mitigate them. This meant end-to-end performance monitoring that combined real user monitoring (RUM), synthetic tests, logs, and application and infrastructure monitoring to trace customer requests through the entire architecture, including cloud services, Kubernetes, and asynchronous, Kafka-based integrations, each of which added to the complexity. AVM was able to achieve end-to-end traceability and created comprehensive dashboards.
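To illustrate why asynchronous, Kafka-based integrations complicate end-to-end tracing: the trace identifier has to travel with each message so that the consumer's work can be linked back to the original customer request. The sketch below is illustrative only and is not taken from the Allstate implementation. It assumes the W3C Trace Context `traceparent` header format and models message headers as a list of (key, bytes) pairs; the function names are hypothetical, and in practice a tracing library would typically handle this step.

```python
import secrets


def new_traceparent():
    """Build a W3C-style traceparent value: version-traceid-spanid-flags."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    return f"00-{trace_id}-{span_id}-01"


def inject(headers, traceparent):
    """Return a copy of the message headers carrying the given traceparent."""
    kept = [(k, v) for k, v in headers if k != "traceparent"]
    return kept + [("traceparent", traceparent.encode("ascii"))]


def extract(headers):
    """Return the traceparent found in the message headers, or None."""
    for key, value in headers:
        if key == "traceparent":
            return value.decode("ascii")
    return None


def child_of(traceparent):
    """Start a new span in the same trace: keep the trace id, change the span id."""
    version, trace_id, _parent_span, flags = traceparent.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"


# Producer side: attach the current trace context to the outgoing message.
producer_context = new_traceparent()
message_headers = inject([("content-type", b"application/json")], producer_context)

# Consumer side: recover the context and continue the same trace.
consumer_context = child_of(extract(message_headers))
```

Because the producer and consumer share one trace id, spans recorded on both sides of the queue can be shown as a single request path.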
The initial priority was the production environment, with the goal of reducing the number of critical incidents. AVM then went beyond production to include the test and dev environments and the CI/CD pipelines, in order to catch issues as early as possible. Chaos testing was introduced to detect gaps in both the architecture and the monitoring.
AVM conducted a full audit of existing dashboards, monitors, and alerts, then worked with the Allstate team to focus them on what was meaningful and actionable. This drew on AVM’s extensive knowledge of cloud-native AWS solutions, messaging and service-based architectures, and observability best practices. AVM also wanted to ensure that tagging standards were in place to aid in creating and navigating dashboards and metrics. These standards needed to account for many observability lenses: application performance, security, cloud spend, and DevOps metrics.
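A tagging standard of this kind can be checked automatically. The following is a minimal sketch, not the standard actually adopted at Allstate. The `env`, `service`, and `version` keys follow Datadog's unified service tagging convention; the other tag keys, the value pattern, and the function name are assumptions made for illustration.

```python
import re

# Hypothetical standard: required tag keys, grouped by the lens they serve.
REQUIRED_TAGS = {
    "application performance": ("env", "service", "version"),
    "security": ("data_classification",),
    "cloud spend": ("cost_center",),
    "devops metrics": ("team",),
}

# Assumed value rule: lowercase, starting with a letter or digit.
TAG_VALUE = re.compile(r"^[a-z0-9][a-z0-9_.\-/]*$")


def audit_tags(tags):
    """Return a list of problems found in one resource's tags (empty if compliant)."""
    problems = []
    for lens, keys in REQUIRED_TAGS.items():
        for key in keys:
            value = tags.get(key)
            if value is None:
                problems.append(f"missing '{key}' (needed for {lens})")
            elif not TAG_VALUE.match(value):
                problems.append(f"invalid value for '{key}': {value!r}")
    return problems


compliant = {
    "env": "prod",
    "service": "quote-api",
    "version": "1.4.2",
    "data_classification": "internal",
    "cost_center": "cc-1234",
    "team": "claims",
}
incomplete = {"env": "Prod", "service": "quote-api"}

for name, tags in (("compliant", compliant), ("incomplete", incomplete)):
    print(name, audit_tags(tags))
```

Running a check like this in a CI/CD pipeline would flag untagged or mistagged resources before they reach a dashboard.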