SRE teams can prevent the decay of processes by creating high-quality documentation, but the most important part is keeping it updated. In this talk I am going to describe the various types of documents SREs create during the life cycle of a service and how they can be kept up to date automatically.
Software documentation is a key communication medium for the decisions made on a project, and this includes experiments with Chaos Engineering. Since CE is still a new discipline, there is not yet a framework for how its experiments should be documented. In this talk I am going to present one.
An SRE's main task is to keep operations up and running. An SRE dealing with Kubernetes faces many challenges in keeping resilience at the desired level and improving it over time. In this talk we will go through techniques to measure and improve the resilience of Kubernetes platforms in a cloud-native way.
Practicing Chaos Engineering and reproducing outages has taught us that the culture of postmortems must be open and blameless. That is difficult, in part, because of the social stigma associated with publicly acknowledging individuals' contributions to outages.