Chaos Engineering in 2021
Engineering Lead @ Bloomberg
+ Generic Mitigations:
The Path to Self-Healing Systems
CTO @ StackPulse
Continuous Security and feedback loop in production
CTO @ CERTUS CYBERSECURITY
For a successful implementation of DevSecOps, there is an immediate need to identify application and infrastructure security gaps in production (e.g. by generating chaos or via dynamic security testing) and to remediate them in an ongoing fashion, an approach called shift up.
Cyber Chaos Engineering: How to Implement Security without a Blindfold
CTO @ VERICA
& DAVID LAVEZZO
DIRECTOR OF CYBER CHAOS ENGINEERING @ CAPITAL ONE
The complex ordeal of delivering secure and reliable software will continue to become exponentially more difficult unless we begin approaching the craft differently.
Enter Chaos Engineering, but now also for security. Instead of a focus on resilience against service disruptions, the focus is to identify the truth behind our current state of security and determine what “normal” operations actually look like when they are put to the test.
The speed, scale, and complex operations within modern systems make it tremendously difficult for humans to mentally model their behavior. Security Chaos Engineering is an emerging practice that is helping engineers and security professionals realign the actual state of operational security and build confidence that it works the way it was intended to.
Leveraging the crowd power to regain faith in Internet’s zero trust architecture
CTO @ CROWDSEC
Worldwide spending on cybersecurity is predicted to reach $1 trillion in 2021. But companies will keep being hacked.
A new approach is needed and the crowd could be the key.
We will explore why a collaborative approach to security could contribute to solving the problem.
Risk-Driven Fault Injection: Security Chaos Engineering for the Fast & Furious
CLOUD SECURITY ENGINEER
The dynamic nature of cloud-native infrastructure requires continuous security mechanisms to effectively detect security threats, especially those with unknown patterns and behavior.
This talk proposes Risk-driven Fault Injection (RDFI) techniques to address these challenges.
Securing the Cloud: Empowering Developers to practice Security Chaos Engineering
YURY NINO ROA
& JHONNATAN GIL CHAVES
@ AVAL DIGITAL LABS
Cloud platforms face security issues that are frequently a matter for security engineers, not for developers. As a result, security is treated as separate from development. Security Chaos Engineering offers a methodology that lets developers leverage the power of security in their own roles.
Attacking & Defending Mobile Apps
SENIOR SECURITY CONSULTANT
@ APTIVA CORP LLC
The talk aims to teach attendees Android & iOS application security, from a basic to an advanced level.
I will cover architecture, file system, security model, application components, the OWASP Mobile Top 10, mitigations, toolsets, frameworks, and techniques used to identify, analyse and exploit vulnerabilities.
When The Network Breaks
TAMMY BRYANT BUTOW
Chaos Engineering is a disciplined approach to identifying failures before they become outages. By proactively testing how a system responds under stress, you can identify and fix failures before they end up in the news. Chaos Engineering lets you compare what you think will happen to what actually happens in your systems.
This talk will share how you can accelerate your understanding of how your network can break (packet loss, blackhole attacks, latency injection, and packet corruption) and impact your services.
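As a rough illustration of the failure modes listed above, the following Python sketch wraps a call and injects packet loss and latency at the application level. The names `FaultyChannel`, `loss_rate`, `latency_s`, and `call_with_retries` are hypothetical and not taken from the talk; real experiments would typically inject these faults at the network layer instead.

```python
import random
import time


class PacketLost(Exception):
    """Raised when the simulated network drops a request."""


class FaultyChannel:
    """Hypothetical wrapper that injects packet loss and latency into calls."""

    def __init__(self, loss_rate=0.0, latency_s=0.0, seed=None, sleep=time.sleep):
        self.loss_rate = loss_rate      # probability in [0, 1] of dropping a call
        self.latency_s = latency_s      # fixed delay added to every call, in seconds
        self._rng = random.Random(seed)
        self._sleep = sleep             # injectable so tests need not really wait

    def call(self, func, *args, **kwargs):
        # Latency injection: delay every call by a fixed amount.
        if self.latency_s > 0:
            self._sleep(self.latency_s)
        # Packet loss: drop the call with probability loss_rate.
        if self._rng.random() < self.loss_rate:
            raise PacketLost("simulated packet loss")
        return func(*args, **kwargs)


def call_with_retries(channel, func, attempts=3):
    """Retry a call through the faulty channel, re-raising after the last attempt."""
    for attempt in range(attempts):
        try:
            return channel.call(func)
        except PacketLost:
            if attempt == attempts - 1:
                raise
```

Comparing how a service behaves with `loss_rate=0.0` against, say, `loss_rate=0.3` is one simple way to check whether its retry logic matches what you expect.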
proactive optimization of cloud resources
& ALICJA RENIEWICZ
FULL STACK ENGINEER
This talk presents a novel concept for adapting cloud resources based on predicted demand. The approach combines advanced forecasting methods with machine-learning-based solvers that can dynamically adapt to a changing workload. The benefits of the approach will be shown.
How to avoid breaking other people's things
LISA KARLIN CURTIS
Unless you know all the assumptions the consumers of your API have made, it’s impossible to reliably avoid breaking their software. Assuming you, like me, can’t read minds, what can we do to try and keep the number of sad developers to a minimum?
Cloud Native Chaos Engineering at scale
CEO @ CHAOS NATIVE
An SRE's main task is to keep operations up and running. An SRE dealing with Kubernetes faces many challenges in keeping resilience at the desired level and improving it over time. In this talk we will go through techniques to measure and improve the resilience of Kubernetes platforms in a cloud-native way.
Maximizing Error Injection Realism for Chaos Engineering
with System Calls
@ KTH ROYAL INSTITUTE
Some perturbation models for chaos engineering, such as Chaos Monkey, are based on a random strategy. However, realistic perturbations could also come from errors that have naturally happened in production. I would like to share more about how to improve the realism of chaos engineering (CE) experiments.
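To make the contrast with purely random perturbation concrete, here is a minimal Python sketch that samples faults in proportion to how often each system call error was seen in production. It is an illustration under assumptions, not the speaker's method: the helper names and the `(syscall, errno)` log format are hypothetical.

```python
import errno
import os
import random
from collections import Counter


def build_error_model(observed):
    """Count how often each (syscall, errno) pair appeared in production logs."""
    return Counter(observed)


def sample_fault(model, rng):
    """Pick a fault to inject, weighted by how often it occurred naturally."""
    pairs = list(model.keys())
    weights = [model[pair] for pair in pairs]
    return rng.choices(pairs, weights=weights, k=1)[0]


def inject(fault):
    """Raise the OSError that the chosen system call failure would produce."""
    syscall, code = fault
    raise OSError(code, f"{syscall}: {os.strerror(code)}")


# Hypothetical observations: connect() failed twice as often as read().
observed = [
    ("read", errno.EIO),
    ("connect", errno.ECONNREFUSED),
    ("connect", errno.ECONNREFUSED),
]
model = build_error_model(observed)
fault = sample_fault(model, random.Random(42))
```

Because the weights come from observed errors, frequent real-world failures are injected more often than rare ones, which is the intuition behind the realism argument.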
A Cloud Architecture for Embracing Failover
What if, instead of designing cloud architectures where failover is an exceptional case, we embraced failover as a normal part of running a system and failed over all the time?
Let's deep dive into an architecture currently in production doing just that and share lessons learned along the way.
Sensory Friendly Monitoring: Keeping the Noise Down
As infrastructure increases in complexity and monitoring increases in granularity, engineering teams can be notified about each and every hiccup in each and every server, container, or process. In this talk, I’ll be discussing how we can stay in tune with our systems without tuning out.
Taming the spatio-temporal-causal uncertainty in
Chaos Engineering and Observability
There are two challenges in observability: uncertainty in prognosis decisions (false positives and false negatives in failure predictions) and discovering causal connections in diagnosis. We address these by modeling spatio-temporal uncertainty for prognosis and by using knowledge representation in a graph database for causal diagnosis.
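The false positives and false negatives mentioned above can be counted directly from a list of failure predictions and what actually happened. The following Python sketch is only an illustration; the function name and the sample data are hypothetical and not from the talk.

```python
def prediction_errors(predicted, actual):
    """Summarise failure predictions against what actually happened.

    Both arguments are equal-length sequences of booleans, where True
    means "failure predicted" or "failure occurred" respectively.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # predicted and occurred
    fp = sum(1 for p, a in pairs if p and not a)      # false alarm
    fn = sum(1 for p, a in pairs if not p and a)      # missed failure
    tn = sum(1 for p, a in pairs if not p and not a)  # correctly quiet
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall}


# Hypothetical predictions for five time windows.
predicted = [True, True, False, False, True]
actual = [True, False, True, False, True]
summary = prediction_errors(predicted, actual)
```

Precision falls as false alarms grow, and recall falls as real failures are missed, so the two numbers together describe the prognosis uncertainty.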
5-Technology Trends and Opportunities for Start-ups & Fortune 500 Companies
FOUNDER @ BOOMERTECHGROUP
My talk will cover technologies such as Artificial Intelligence (AI), the Internet of Things (IoT), Machine Learning, Software-as-a-Service (SaaS) and Big Data. It will review hot jobs and skills to acquire. Lastly, it will cover how start-ups and corporate workers can use these technologies.
Clear the ring for Chaos Engineering at Vertrieb Deutsche Bahn! One year sensations and attractions!
@ CODECENTRIC AG
& OLIVER KRACHT
@ DEUTSCHE BAHN VERTRIEB GMBH
Have you ever been to the circus?
As software developers, we have much more in common with circus trainers than we think. On the one hand, we have to tame and maintain a steadily growing zoo of technologies and on the other hand, the undesirable audience expects us to show more and more astonishing features within a short period of time. On top of that, we’re also shooting right into the circus ring with our big CI/CD cannon. That can lead to interesting effects, because we’re shooting while a show is running.
Let's talk about some experiences, learnings and failures. You're asking about our learning environment? Well, there you go: 300+ microservices, 100+ developers, 100+ Gamedays, countless experiments with various outcomes :)
In the kitchen: A sprinkle of fire and chaos
ANA MARGARITA MEDINA
SENIOR CHAOS ENGINEER
How do you ensure your food tastes good? Or maybe you don't and everyone hates your cooking. How do you learn to avoid burning or cutting yourself while cooking? Is the kitchen on fire or is the smoke alarm just complaining that it needs a new battery?
What does all of this have to do with the cloud? Chaos and Learning! She will talk about her favorite ingredient for shipping resilient cloud-native applications.
Learn by fire! Chaos and Learning!
One year of SRE failures
LEAD SRE @ BOL.COM
Last year we pitched SRE to our management team and got the OK to get cracking.
We've achieved a lot, but failed even more.
This talk is a front-row seat to a blameless postmortem on the first year of SRE at bol.com, the largest online retailer in The Netherlands and Belgium.
Organizational Chaos and recipes for Service Ownership
CEO @ EFFX.COM
Service Ownership can mean a lot of things in a growing engineering organization. The advent of microservices has made it more critical to get right. In this talk, we'll talk through all the ways your organization can cause operational chaos before you get Service Ownership correct.
Role of Quality Engineers
REUBEN RAJAN GEORGE
CLOUD RELIABILITY ARCHITECT
The role of traditional testers has evolved to validate the resilience of modern applications and infrastructure. During this session, the speaker will share some insights and lessons learned while helping customers on their quality engineering transformation journey.
This talk will cover the following points:
- Applying observability in testing processes (functional testing, performance testing etc.)
- Automating resilience (chaos) tests along with performance tests
Sleeping with one eye open: Experiences with
EXPERT SOFTWARE ENGINEER
@ ZUHLKE ENGINEERING LTD
In this presentation I would like to talk about my experience as a software production support engineer. For over 7 years I have been supporting software to varying degrees and have developed some insights I feel would be helpful to others.
In the talk I will cover the most vital aspects of software production support and include some of the more memorable stories, with the lessons learned.
Software won't make a noise when it breaks
CTO @ LAST9.IO
Systems fail, but the real failures are the ones from which we learn nothing. This talk is a tale of a few such failures that went right under our noses and what we did to prevent them. The topics covered range from heterogeneous systems and unordered events to missing correlations and human errors.
Onboarding Chaos Engineering
PRACTICE LEAD @ NUAWARE
Now that stakeholder approval for something as outrageous sounding as "chaos engineering" has been granted, a new challenge arises. This talk is a high-level guide on how to prepare and what to expect from the first months of (deliberate) chaos in your organisation. In other words, the “before, during and after” of making chaos successful.
Postmortems, Continuous Learning and Enabling Blameless Culture
So you’ve had an incident. Restoring service is just the first step—your team should also be prepared to learn from incidents and outages. In this talk you will learn some best practices around postmortems/post incident reviews to help your team and organization see incidents as a learning opportunity and not just a disruption in service. In this workshop, attendees will:
* Get an overview of blameless postmortems
* Learn techniques for effective information sharing
* Learn how to run a postmortem meeting effectively
* Understand the difference between “blame” and “accountability”
the Fear of Failure
CEO @ LOCELLE
The fear of failure and rejection holds so many of us back from achieving our ambitions in life. Especially for women, the fear of failure and rejection becomes ever more prevalent as we struggle not only against our personal insecurities but also against the pressure put on us by society.
Humaira, who grew up in Pakistan, shares her experience of pursuing higher education while facing social and family expectations to get married at a young age.
Humaira shares what drove her to achieve more and how she confronted her own fears of failure and rejection to secure her future by completing university, working in tech, and eventually starting Locelle, with hopes to inspire and empower women to step into leadership and overcome their fears to succeed in their careers.
Incident Ready: How to Chaos Engineer Your Incident Response Process
CEO @ FIREHYDRANT.COM
Imagine pulling the fire alarm on your team and throwing them into an incident response process they haven't prepared for.
That's probably not too hard to imagine because it happens all too often. This talk dives into practical ways to use chaos engineering to stress test your incident response.
We’re pretty sure using a real incident to test a new response process is not the best idea. So, how do you test your process ahead of time?
ENGINEERING PROGRAM MANAGER @ GOOGLE (SRE)
Psychological safety has been identified as the topmost feature of a successful and innovative organization. At the same time, we need to learn from failure and prevent recurrence of mistakes.
These two practices seem to contradict each other, but is there a way to achieve them both?
Creating a learning culture
SENIOR VP R&D @ PERIMETERX
Building and maintaining a five 9s system isn’t just about the tools and technologies. Development culture has a big part in how you keep a system available while scaling it up and supporting more features, users, and locations.
A healthy learning culture, one that supports development and identifies weak points rather than just repairing mistakes, is another tool in the engineering toolbox.
In this talk, we will discuss how to create a learning culture using debriefs, what to avoid, and how to instill change in an engineering organization.
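For context on the "five 9s" target mentioned in this abstract, the short Python sketch below converts a number of nines into the downtime allowed per year. The function name is hypothetical, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def allowed_downtime_minutes(nines):
    """Return the yearly downtime budget, in minutes, for N nines of availability."""
    unavailability = 10 ** -nines  # e.g. 5 nines -> 0.00001
    return unavailability * MINUTES_PER_YEAR


five_nines_budget = allowed_downtime_minutes(5)   # roughly 5.26 minutes per year
three_nines_budget = allowed_downtime_minutes(3)  # roughly 525.6 minutes per year
```

With only a few minutes of budget per year, a single slow incident response can consume the whole allowance, which is why the abstract stresses culture alongside tooling.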
Día de los Muertos - Postmortems
that save lives
CTO @ BXBLUE