Building reliable computer systems has never been that easy!
Reliably deploying and maintaining machine learning applications is complex. There's a dizzying array of tools, and they look different from the usual DevOps tools.
To apply SRE skills to ML, we need to understand the specific challenges of ML build-deploy-monitor workflows. We'll use reference examples to walk through the cycle in terms of data prep, training, rollout, and monitoring. We'll see that some key challenges relate to training models from slices of large and varying data domains, a problem alien to the mainstream DevOps world.