Conf42: Chaos Engineering 2021

...

One year of SRE failures

Bart Enkelaar
Lead SRE @ bol.com



Last year we pitched SRE to our management team and got the OK to get cracking. We’ve achieved a lot, but failed even more. This talk is a front-row seat to a blameless postmortem on the first year of SRE at bol.com, the largest online retailer in the Netherlands and Belgium.

Getting SRE right is hard. We tried in 2017 and failed. We tried in 2019 and failed. We picked ourselves up, dusted ourselves off, took the lessons learned and tried again in 2020. This time we’re here to stay. bol.com is the largest online retail platform in the Netherlands and Belgium. We have about 10 million active daily users and innovate with about 700 software engineers.

This is the story of a scrappy team of Site Reliability Engineers trying to shift a large enterprise (about 3000 employees) to think SRE. It covers our successes but, far more interestingly, our failures and learnings.
