SITE RELIABILITY ENGINEERING (SRE)

JULY 30 2020

ONLINE
SESSION

HIGHLIGHTED TALKS

Driving Service Ownership with Distributed Tracing

TALK BY DANIEL "SPOONS" SPOONHOWER

CTO & CO-FOUNDER @ LIGHTSTEP

While many organizations are rolling out Kubernetes, breaking up their monoliths, and adopting DevOps practices with the hope of increasing developer velocity and improving reliability, it’s not enough just to put these tools in the hands of developers: you’ve got to incentivize developers to use them. Service ownership provides these incentives, by holding teams accountable for metrics like the performance and reliability of their services as well as by giving them the agency to improve those metrics.

From application to product ownership: an SRE team's journey

TALK BY NIKOLAUS RATH

TECH-LEAD @ GOOGLE SRE TEAM IN LONDON

In the past, my team was responsible for specific applications/executables. Now, we are responsible for specific end-user workflows, no matter which executables they involve. I will describe the technical and social changes that necessitated and enabled this change of paradigm. At the end of 2019, our team supported on the order of 200 executables/microservices that provide functionality for Google's products in the advertising space.

This number was the result of continuous growth over more than 10 years.

SREs love Chaos Engineering

TALK BY MIKOLAJ PAWLIKOWSKI

SOFTWARE ENGINEER PROJECT LEAD @ BLOOMBERG LP

What's Chaos Engineering? Is it part of SRE? Is it breaking things randomly in production? In this talk, we'll try to settle these questions once and for all, and to give you a life-like demo of what Chaos Engineering looks like in practice!

Building and Leading Remote Teams

TALK BY AMBER VANDERBURG

BUSINESSPERSON, COACH & SPEAKER

FOUNDER OF THE PATHWAYZ GROUP

The world of work is constantly changing as we create new products, provide excellent service, and collaborate on new ventures. I'll give you tools to overcome remote team challenges: confronting communication frustrations, setting expectations, and strategically building and equipping the right-fit remote team.

Applied Security: Crafting Secure and Resilient Distributed Systems using Chaos Engineering

CO-TALK BY

AARON RINEHART, CTO @ VERICA

& JAMIE DICKEN, MANAGER OF SECURITY ENGINEERING @ CARDINAL HEALTH

Join Jamie Dicken and Aaron Rinehart to learn how they implemented Security Chaos Engineering as a practice at their organizations to proactively discover system weaknesses before malicious adversaries could take advantage of them.

In this session, Jamie and Aaron will introduce a new concept known as Security Chaos Engineering and share their experiences in applying it to create highly secure, performant, and resilient distributed systems.

Machine Learning in Production: An Intro to MLOps

TALK BY RYAN DAWSON

CORE MEMBER @ SELDON OPEN SOURCE TEAM

Reliably deploying and maintaining machine learning applications is complex. There's a dizzying array of tools, and they look different from the usual DevOps tools. To apply SRE skills to ML, we need to understand the specific challenges of ML build-deploy-monitor workflows. We'll use reference examples to understand the cycle in terms of data prep, training, rollout, and monitoring. We'll see that some key challenges relate to training models from slices of large and varying data domains, a problem alien to the mainstream DevOps world.

Security Chaos Engineering: Considerations for Gamedays when the Experiments are Cyberattacks

TALK BY YURY NINO ROA

SRE @ AVAL DIGITAL LABS

Chaos Gamedays have proven successful in the training of operations and on-call teams. However, they have not been fully explored when the failures are related to cyberattacks. In this talk, we are going to explore how to adapt the methodology for Chaos Gamedays with security experiments.

Increasing Kubernetes Resilience for an SRE

TALK BY UMA MUKKARA

COO @ MAYADATA

An SRE's main task is to keep operations up and running. An SRE dealing with Kubernetes faces many challenges in keeping resilience at the desired level and improving it over time. In this talk, we will go through techniques to measure and improve the resilience of Kubernetes platforms in a cloud-native way.

Tinkerbell: An Automated Bare Metal Provisioning Engine

TALK BY AMAN PARAULIYA

SENIOR SOFTWARE ENGINEER @ INFRACLOUD TECHNOLOGIES

In the cloud native world, bare metal servers are critical for performance- and security-sensitive applications. Tinkerbell solves the problem of provisioning and lifecycle management for bare metal. Starting with bare metal concepts, we will cover provisioning from small IoT devices to big rack servers.

The Innovation Ninja

TALK BY AMBER VANDERBURG

BUSINESSPERSON, COACH & SPEAKER

FOUNDER OF THE PATHWAYZ GROUP

Organizations must be innovative to be competitive, valuable, and relevant. If you’re like many organizations today, you might be wondering how our season of widespread isolation will affect product and service innovation in your organization. Limits can encourage and inspire creativity, but there are also proactive steps that you can take to unlock creative genius in individuals and in teams.
