Conf42: Site Reliability Engineering 2021

...

Let the machines optimize the machines: ML-driven automated performance tuning

Stefano Doni
CTO @ Akamas



SREs’ main goal is to achieve optimal application performance, efficiency, and availability. Configurations play a crucial role (e.g. JVM and DBMS settings, container CPU and memory): wrong settings can cause poor performance and incidents. But tuning configurations is a manual and lengthy task, because the stack has hundreds of settings that interact in counterintuitive ways.

In this talk, we present a new approach that leverages machine learning to find optimal configurations of the tech stack. The optimization process is automated and driven by performance goals and constraints that SREs can define (e.g. minimize resource footprint while meeting latency and throughput SLOs). We show examples of optimizing Kubernetes microservices for cost efficiency and latency by tuning container sizing and JVM options.

With the help of ML, SREs can achieve higher application performance, in days instead of months, and have a lot of fun in the process!
