Conf42: Site Reliability Engineering 2022

...

Alerting on SLOs and Error Budget Policies

Ricardo Castro
Lead SRE @ Anova



Assessing your system’s reliability through SLOs is a great way to understand and measure how happy users are with your service(s). Error Budgets tell you how much unreliability you can still tolerate before users become unhappy. Ideally, you want to be alerted well before users are dissatisfied, so you can take appropriate measures to keep them happy. How can you achieve that?
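To make the error budget idea concrete, here is a minimal sketch of the arithmetic, assuming an availability-style SLO measured as good events over total events. The function names, the event counts, and the 50% alert threshold are illustrative assumptions, not taken from the talk.

```python
def error_budget_remaining(slo_target: float, good_events: int, total_events: int) -> float:
    """Return the fraction of the error budget still unspent.

    1.0 means nothing has been spent, 0.0 means the budget is exhausted,
    and a negative value means the SLO has been breached.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events <= 0:
        return 1.0  # no traffic in the window, so no budget was spent

    # The budget is the number of bad events the SLO tolerates in this window.
    allowed_bad = (1.0 - slo_target) * total_events
    actual_bad = total_events - good_events
    return 1.0 - actual_bad / allowed_bad


def should_alert(remaining: float, threshold: float = 0.5) -> bool:
    """Alert once the unspent budget drops below a hypothetical threshold."""
    return remaining < threshold
```

With a 99.9% target over 1,000,000 requests, the budget is roughly 1,000 failed requests. Alerting when a chosen share of that budget is gone, rather than when it reaches zero, is one simple way to be notified before users are dissatisfied.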

That’s where alerting on SLOs and Error Budget Policies come into the picture. By tracking how happy your users are through SLOs, and alerting well before their dissatisfaction reaches critical levels, you’ll be able to define policies to deal with issues in a timely manner, ensuring operational excellence.
