Chaos Engineering

FEBRUARY 25 2021

6PM CET | 12PM ET

ONLINE

VOD
EVENT


KEYNOTE


Chaos Engineering in 2021

Mikolaj Pawlikowski

Engineering Lead @ Bloomberg

Author "Chaos Engineering: Site reliability through controlled disruption"

HIGHLIGHTED TALK


Chaos Engineering + Generic Mitigations: The Path to Self-Healing Systems

Leonid Belkind

CTO @ StackPulse

TRACKS

Register to watch all the content on the kick-off date
Unsubscribe any time.

It's FREE!

 
 


SECURITY

Shift up: Continuous Security and feedback loop in production

SWAPNIL DESHMUKH

CTO @ CERTUS CYBERSECURITY

For a successful implementation of DevSecOps, there is an immediate need to identify application and infrastructure security gaps in production (e.g. by generating chaos or via dynamic security testing) and to remediate them on an ongoing basis, an approach called shift up.

Cyber Chaos Engineering: How to Implement Security without a Blindfold

AARON RINEHART

CTO @ VERICA

& DAVID LAVEZZO
DIRECTOR OF CYBER CHAOS ENGINEERING @ CAPITAL ONE

The complex ordeal of delivering secure and reliable software will continue to become exponentially more difficult unless we begin approaching the craft differently.


Enter Chaos Engineering, but now also for security. Instead of focusing on resilience against service disruptions, the focus is on identifying the truth behind the current state of our security and determining what “normal” operations actually look like when put to the test.

The speed, scale, and complexity of operations within modern systems make it tremendously difficult for humans to mentally model their behavior. Security Chaos Engineering is an emerging practice that helps engineers and security professionals realign the actual state of operational security and build confidence that it works the way it was intended to.

Leveraging the crowd power to regain faith in Internet’s zero trust architecture

THIBAULT KOECHLIN
CTO @ CROWDSEC

Worldwide spending on cybersecurity is predicted to reach $1 trillion in 2021. But companies will keep being hacked.
A new approach is needed and the crowd could be the key.


We will explore why a collaborative approach to security could contribute to solving the problem.

 

Risk-Driven Fault Injection: Security Chaos Engineering for the Fast & Furious

KENNEDY TORKURA

CLOUD SECURITY ENGINEER
@ MATTERMOST

The dynamic nature of cloud-native infrastructure requires continuous security mechanisms to effectively detect security threats, especially those with unknown patterns and behavior.

 

This talk proposes Risk-driven Fault Injection (RDFI) techniques to address these challenges.

Securing the Cloud: Empowering Developers to practice Security Chaos Engineering

YURY NINO ROA

& JHONNATAN GIL CHAVES

DEVOPS ENGINEERS 
@ AVAL DIGITAL LABS

Cloud platforms face security issues that are frequently treated as a matter for security engineers, not for developers. As a result, security is kept separate from development. Security Chaos Engineering offers a methodology that lets developers leverage the power of security in their own roles.

Attacking & Defending Mobile Apps

ROMANSH YADAV

SENIOR SECURITY CONSULTANT
@ APTIVA CORP LLC

The talk aims to teach attendees Android & iOS application security, from a basic to an advanced level.

I will cover architecture, the file system, the security model, application components, the OWASP Mobile Top 10, mitigations, toolsets, frameworks, and the techniques used to identify, analyse, and exploit vulnerabilities.

DEEP DIVE

Chaos Engineering: When The Network Breaks

TAMMY BRYANT BUTOW

PRINCIPAL SRE
@ GREMLIN

Chaos Engineering is a disciplined approach to identifying failures before they become outages. By proactively testing how a system responds under stress, you can identify and fix failures before they end up in the news. Chaos Engineering lets you compare what you think will happen to what actually happens in your systems.

This talk will share how you can accelerate your understanding of how your network can break (packet loss, blackhole attacks, latency injection, and packet corruption) and how that impacts your services.
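As a rough illustration of two of the failure modes listed above (packet loss and latency injection), the following Python sketch simulates them around an ordinary function call and checks how a simple retry loop copes. It is not taken from the talk or from any chaos engineering tool: `NetworkFault`, `PacketDropped`, and `call_with_retries` are hypothetical names, and latency is reported as a number rather than actually slept so that the sketch runs instantly.

```python
import random
from dataclasses import dataclass
from typing import Callable, Tuple


class PacketDropped(Exception):
    """Raised when the simulated network drops a request."""


@dataclass
class NetworkFault:
    """Hypothetical fault model: drop some calls, delay the ones that get through."""

    loss_rate: float         # probability in [0, 1] that a call is dropped
    added_latency_ms: float  # simulated delay attached to every delivered call

    def send(self, call: Callable[[], str], rng: random.Random) -> Tuple[str, float]:
        # Decide whether this call is lost before it reaches the service.
        if rng.random() < self.loss_rate:
            raise PacketDropped("simulated packet loss")
        # Report the injected latency instead of sleeping.
        return call(), self.added_latency_ms


def call_with_retries(
    fault: NetworkFault,
    call: Callable[[], str],
    rng: random.Random,
    max_attempts: int = 3,
) -> Tuple[str, int]:
    """Return (response, attempts_used); re-raise if every attempt is dropped."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            response, _latency_ms = fault.send(call, rng)
            return response, attempt
        except PacketDropped:
            if attempt == max_attempts:
                raise
    raise AssertionError("unreachable")


if __name__ == "__main__":
    rng = random.Random(42)  # seeded so repeated runs behave the same
    fault = NetworkFault(loss_rate=0.3, added_latency_ms=200.0)
    try:
        print(call_with_retries(fault, lambda: "pong", rng))
    except PacketDropped:
        print("all attempts were dropped")
```

Comparing the number of attempts you expected with the number actually used is a small-scale version of comparing what you think will happen with what actually happens.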

Forecasting-based proactive optimization of cloud resources

PAWEL SKRZYPEK

SOFTWARE ARCHITECT

& ALICJA RENIEWICZ

FULL STACK ENGINEER
@ 7BULLS.COM

This talk presents a novel concept for adapting cloud resources based on predicted demand. The approach combines advanced forecasting methods with machine-learning-based solvers, which can dynamically adapt to a changed workload. The benefits of the approach will be shown.
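To make the idea of scaling on predicted rather than observed demand concrete, here is a deliberately naive Python sketch. It is not the speakers' method: it swaps their advanced forecasting and machine-learning solvers for a simple linear extrapolation of the last two observations, and the function names and the per-replica capacity figure are assumptions made purely for illustration.

```python
import math
from typing import Sequence


def forecast_next(history: Sequence[float]) -> float:
    """Naive forecast: extend the most recent trend by one step, never below zero."""
    if not history:
        raise ValueError("history must contain at least one observation")
    if len(history) == 1:
        return float(history[0])
    trend = history[-1] - history[-2]
    return max(0.0, float(history[-1] + trend))


def replicas_needed(
    predicted_load: float,
    capacity_per_replica: float,
    min_replicas: int = 1,
) -> int:
    """Smallest replica count that covers the predicted load."""
    if capacity_per_replica <= 0:
        raise ValueError("capacity_per_replica must be positive")
    return max(min_replicas, math.ceil(predicted_load / capacity_per_replica))


if __name__ == "__main__":
    requests_per_second = [100.0, 120.0, 140.0]  # made-up sample workload
    predicted = forecast_next(requests_per_second)
    # Assumes, hypothetically, that one replica handles 50 requests per second.
    print(predicted, replicas_needed(predicted, capacity_per_replica=50.0))
```

The point of the sketch is only the ordering of the steps: forecast first, then provision for the forecast, so capacity is in place before the load arrives.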

How to avoid breaking other people's things

LISA KARLIN CURTIS

TECH LEAD
@ GOCARDLESS

Unless you know all the assumptions the consumers of your API have made, it’s impossible to reliably avoid breaking their software. Assuming you, like me, can’t read minds, what can we do to try and keep the number of sad developers to a minimum?

Cloud Native Chaos Engineering at scale

UMA MUKKARA

CEO @ CHAOS NATIVE

An SRE's main task is to keep operations up and running. An SRE dealing with Kubernetes faces many challenges in keeping resilience at the desired level and improving it over time. In this talk, we will go through techniques to measure and improve the resilience of Kubernetes platforms in a Cloud-Native way.

Maximizing Error Injection Realism for Chaos Engineering with System Calls

LONG ZHANG

PHD STUDENT
@ KTH ROYAL INSTITUTE OF TECHNOLOGY

Some perturbation models for chaos engineering, such as ChaosMonkey, are based on a random strategy. However, realistic perturbations could also come from errors that have occurred naturally in production. I would like to share more about how to improve the realism of chaos engineering experiments.

Normalizing Chaos: A Cloud Architecture for Embracing Failover

RYAN GUEST

SOFTWARE ARCHITECT 
@ SALESFORCE

What if, instead of designing cloud architectures where failover is an exceptional case, we embraced failover as a normal part of running a system and failed over all the time?

Let's deep dive into an architecture currently in production doing just that and share lessons learned along the way.

Sensory Friendly Monitoring: Keeping the Noise Down

QUINTESSENCE ANX

DEVELOPER ADVOCATE
@ PAGERDUTY

As infrastructure increases in complexity and monitoring increases in granularity, engineering teams can be notified about each and every hiccup in each and every server, container, or process. In this talk, I’ll be discussing how we can stay in tune with our systems without tuning out.

Taming the spatio-temporal-causal uncertainty in Chaos Engineering and Observability

MAHESH VENKATARAMAN

MANAGING DIRECTOR

@ ACCENTURE

There are two challenges in observability: uncertainty in prognosis decisions (false positives and false negatives in failure predictions) and discovering causal connections in diagnosis. We address these by modeling spatio-temporal uncertainty for prognosis and by using knowledge representation and a graph database for causal diagnosis.

LESSONS LEARNED

5-Technology Trends and Opportunities for Start-ups & Fortune 500 Companies

DERRIS BOOMER

FOUNDER @ BOOMERTECHGROUP

My talk will cover technologies such as Artificial Intelligence (AI), the Internet of Things (IoT), Machine Learning, Software-as-a-Service (SaaS), and Big Data. It will review hot jobs and skills to acquire. Lastly, it will cover how start-ups and corporate workers can use these technologies.

Clear the ring for Chaos Engineering at Vertrieb Deutsche Bahn! One year sensations and attractions!

MAIK FIGURA

IT CONSULTANT

@ CODECENTRIC AG

& OLIVER KRACHT
IMPLEMENTATION LEAD

@ DEUTSCHE BAHN VERTRIEB GMBH

Have you ever been to the circus?

As software developers, we have much more in common with circus trainers than we think. On the one hand, we have to tame and maintain a steadily growing zoo of technologies and on the other hand, the undesirable audience expects us to show more and more astonishing features within a short period of time. On top of that, we’re also shooting right into the circus ring with our big CI/CD cannon. That can lead to interesting effects, because we’re shooting while a show is running.

Let's talk about some experiences, learnings, and failures. You're asking about our learning environment? Well, there you go: 300+ microservices, 100+ developers, 100+ Gamedays, countless experiments with various outcomes :)

In the kitchen: A sprinkle of fire and chaos

ANA MARGARITA MEDINA

SENIOR CHAOS ENGINEER

@ GREMLIN

How do you ensure your food tastes good? Or maybe you don't and everyone hates your cooking. How do you learn to avoid burning or cutting yourself while cooking? Is the kitchen on fire or is the smoke alarm just complaining that it needs a new battery?

What does all of this have to do with the cloud? Chaos and Learning! She will talk about her favorite ingredient for shipping resilient cloud-native applications.

Learn by fire! Chaos and Learning!

One year of SRE failures

BART ENKELAAR

LEAD SRE @ BOL.COM

Last year we pitched SRE to our management team and got the OK to get cracking.

We've achieved a lot, but failed even more.

This talk is a front-row seat to a blameless postmortem on the first year of SRE at bol.com, the largest online retailer in The Netherlands and Belgium.

Organizational Chaos and recipes for Service Ownership

JOEY PARSONS

CEO @ EFFX.COM

Service Ownership can mean a lot of things in a growing engineering organization. The advent of microservices has made it more critical to get right. In this talk, we'll talk through all the ways your organization can cause operational chaos before you get Service Ownership correct.

Role of Quality Engineers in SRE

REUBEN RAJAN GEORGE

CLOUD RELIABILITY ARCHITECT

@ ACCENTURE

The role of traditional testers has evolved to include validating the resilience of modern applications and infrastructure. During this session, the speaker will share insights and lessons learned while helping customers through their quality engineering transformation journey.

This talk will cover the following points:
- Applying observability in testing processes (functional testing, performance testing etc.)
- Automating resilience (chaos) tests along with performance tests

Sleeping with one eye open: Experiences with production support

QUINTIN BALSDON

EXPERT SOFTWARE ENGINEER

@ ZUHLKE ENGINEERING LTD

In this presentation, I would like to talk about my experience as a software production support engineer. For over 7 years I have been supporting software to various degrees and have developed some insights that I feel would be helpful to others.

In the talk I will cover the most vital aspects of software production support and include some of the more memorable stories, with the lessons learned.

Software won't make a noise when it breaks

PIYUSH VERMA

CTO @ LAST9.IO

Systems fail, but the real failures are the ones from which we learn nothing. This talk is a tale of a few such failures that went right under our noses and what we did to prevent them. The topics covered range from heterogeneous systems and unordered events to missing correlations and human errors.

CULTURE

Onboarding Chaos Engineering

KAROLINA RACHWAL

CHAOS ENGINEERING PRACTICE LEAD @ NUAWARE

Now that stakeholder approval for something as outrageous sounding as "chaos engineering" has been granted, a new challenge arises. This talk is a high-level guide on how to prepare and what to expect from the first months of (deliberate) chaos in your organisation. In other words, the “before, during and after” of making chaos successful.

Postmortems, Continuous Learning and Enabling Blameless Culture

JULIE GUNDERSON

ADVOCATING DEVOPS

@ PAGERDUTY

So you’ve had an incident. Restoring service is just the first step—your team should also be prepared to learn from incidents and outages. In this talk you will learn some best practices around postmortems/post incident reviews to help your team and organization see incidents as a learning opportunity and not just a disruption in service. In this workshop, attendees will:

* Get an overview of blameless postmortems
* Learn techniques for effective information sharing
* Learn how to run a postmortem meeting effectively
* Understand the difference between “blame” and “accountability”

Embracing the Fear of Failure

HUMAIRA AHMED

CEO @ LOCELLE

The fear of failure and rejection holds so many of us back from achieving our ambitions in life. Especially for women, the fear of failure and rejection becomes ever more prevalent as we struggle not only against our personal insecurities but also against the pressure put on us by society.

 

Humaira, who grew up in Pakistan, shares her experience of pursuing higher education while facing social and family expectations to get married at a young age.

 

Humaira shares what drove her to achieve more and how she confronted her own fears of failure and rejection to secure her future by completing university, working in tech, and eventually starting Locelle, in the hope of inspiring and empowering women to step into leadership and overcome their fears to succeed in their careers.

Incident Ready: How to Chaos Engineer Your Incident Response Process

ROBERT ROSS

CEO @ FIREHYDRANT.COM

Imagine pulling the fire alarm on your team and throwing them into an incident response process they haven't prepared for.
That's probably not too hard to imagine because it happens all too often. This talk dives into practical ways to use chaos engineering to stress test your incident response.

We’re pretty sure using a real incident to test a new response process is not the best idea. So, how do you test your process ahead of time?

Blameless Postmortem Culture

PRANJAL DEO

ENGINEERING PROGRAM MANAGER @ GOOGLE (SRE)

Psychological safety has been identified as the topmost feature of a successful and innovative organization. At the same time, we need to learn from failure and prevent recurrence of mistakes.


These two practices seem to contradict each other, but is there a way to achieve them both?

Creating a learning culture

AMIR SHAKED

SENIOR VP R&D @ PERIMETERX

Building and maintaining a five 9s system isn’t just about the tools and technologies. Development culture plays a big part in how you keep a system available while scaling it up and supporting more features, users, and locations.


A healthy learning culture, one that supports development rather than just repairing mistakes and that identifies weak points, is another tool in the engineering toolbox.
In this talk, we will discuss how to create a learning culture using debriefs, what to avoid, and how to instill change in an engineering organization.

Día de los Muertos - Postmortems that save lives

FABRICIO BUZETO

CTO @ BXBLUE

Postmortems are a well-established way to document the history of a project, especially when things break or don't go as planned. Most teams have a hard time keeping up with them. Among those that do, getting value out of them is another challenge.

Let me share how my team made postmortems part of our process. By not only making them a must-have when handling emergencies but also celebrating them, we turned them into a tool for team bonding. We also use postmortems as an onboarding tool for newcomers. To finish, I'll share two occasions where our postmortems helped us avoid issues and paid for themselves.

 
 
 
 
 
