Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello everyone, and welcome to my session on automating AI ethics:
MLOps-native data governance for production-ready generative AI pipelines.
My name is Nu Mariela, and over the next 20 minutes I'll walk you
through the governance challenges that enterprises face today with
generative AI and how MLOps-native approaches can turn compliance
from a bottleneck into an advantage.
A little bit about myself.
I specialize in MLOps, data governance, and AI ethics.
My focus is on building responsible, scalable AI systems across platforms
like AWS, Azure, GCP, and Snowflake.
Over the years I've worked on deploying enterprise AI solutions that not only meet
performance goals but also adhere to strict regulatory requirements.
My mission is to transform governance from being a hurdle into a strategic asset.
Now, when we think about governance in AI, organizations often face three
challenges. First, speed versus compliance:
how do we deploy fast without skipping the checks? Second, tracking challenges:
data provenance gets harder as datasets scale.
Third, regulatory burden: navigating multiple jurisdictions without slowing innovation.
This creates a tension: should we move fast or stay compliant?
The answer, of course, should be both.
This is exactly where
the Training Data Declarations, or TDD, framework comes in.
Unlike traditional governance, TDD is embedded directly into CI/CD pipelines.
That means compliance monitoring happens automatically, in real time,
without slowing down deployments or requiring major infrastructure changes.
The key features of the TDD framework are as follows.
Number one is a four-tier classification system that categorizes data by risk and provenance.
Number two is a standard metadata schema,
so every system speaks the same governance language.
Number three is real-time risk assessment,
evaluating risk continuously during training and deployment.
Number four is comprehensive audit trails, which give regulators
a complete compliance record.
Together, these make governance scalable and frictionless.
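The talk does not describe how the audit trail is stored. As one possible illustration, a tamper-evident log can be built by chaining each entry to the hash of the previous one. The hash-chain technique, the class name AuditTrail, and the event fields below are assumptions for this sketch, not part of TDD as presented.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder hash for the first entry in the chain


class AuditTrail:
    """Hypothetical append-only log where each entry commits to the previous one."""

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(prev_hash: str, event: dict) -> str:
        # Serialize deterministically so the same event always hashes the same way.
        payload = json.dumps(event, sort_keys=True)
        return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

    def append(self, event: dict) -> dict:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        entry = {
            "event": event,
            "prev_hash": prev_hash,
            "hash": self._digest(prev_hash, event),
        }
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev_hash = GENESIS_HASH
        for entry in self.entries:
            if entry["prev_hash"] != prev_hash:
                return False
            if entry["hash"] != self._digest(prev_hash, entry["event"]):
                return False
            prev_hash = entry["hash"]
        return True
```

Calling verify() after appending events returns True, and returns False once any stored event has been modified, which is the property an auditor would rely on.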
In practice, we've implemented these approaches across AWS,
Azure, GCP, and Snowflake.
The strategies included container-native governance tools,
automated policy enforcement within Kubernetes, and monitoring
dashboards for real-time verification.
The key is that governance travels with the model wherever it's deployed.
Different regions have different rules.
The TDD framework supports jurisdiction-specific automation.
It can adapt to regional compliance needs, manage cross-border data
flows, and ensure scalability without sacrificing compliance.
This is especially important for global enterprises that operate across
multiple regulatory environments.
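To make the idea of jurisdiction-specific automation concrete, here is a minimal sketch of a per-region policy lookup. The region names, the RegionPolicy fields, and the policy values are placeholders invented for illustration. They do not represent real regulatory requirements or the framework's actual rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionPolicy:
    """Hypothetical per-jurisdiction rules."""
    allow_cross_border_transfer: bool
    max_tier: int  # highest risk tier permitted (1 = most trusted, 4 = restricted)


# Placeholder policies for illustration only.
POLICIES = {
    "region-a": RegionPolicy(allow_cross_border_transfer=False, max_tier=2),
    "region-b": RegionPolicy(allow_cross_border_transfer=True, max_tier=3),
}


def check_deployment(data_region: str, deploy_region: str, tier: int) -> list[str]:
    """Return a list of violations; an empty list means the deployment passes."""
    policy = POLICIES[data_region]
    violations = []
    if tier > policy.max_tier:
        violations.append(
            f"tier {tier} exceeds max tier {policy.max_tier} for {data_region}"
        )
    if data_region != deploy_region and not policy.allow_cross_border_transfer:
        violations.append(
            f"cross-border transfer from {data_region} to {deploy_region} is not allowed"
        )
    return violations
```

Keeping the rules in data rather than in code means a new jurisdiction can be added by extending the table, without touching the check itself.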
One big governance gap is licensing.
Many training datasets include copyrighted
or ambiguously licensed material.
With TDD, we add automated license detection, risk
evaluation, and remediation,
and all of this happens before deployment.
This turns what could be a legal nightmare into a manageable process.
So I want to leave you with three practical takeaways from this session.
Number one, ready-to-deploy code samples
you can adapt right away. Number two, architectural patterns for integrating governance
into existing MLOps workflows.
And last, velocity-enhancing strategies:
governance that increases speed rather than slowing it down.
So governance
is not just about reducing risk.
If it's done right, it actually accelerates innovation.
By removing uncertainty and making all the rules clear,
you free teams to innovate without fear of non-compliance,
and that's how compliance becomes a competitive advantage.
Now let's take a closer look at how TDD works in practice.
The four-tier classification system helps us categorize datasets
based on risk level, from highly trusted to restricted sources.
This enables precise management of sensitive data and
tailored governance controls.
In addition to classification, we use a standardized metadata schema
so that all datasets carry uniform governance information, such as
licensing and transformation history.
This standardization ensures consistent compliance
tracking across all systems.
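As a rough sketch of how a classification tier and a uniform metadata record might be expressed in Python: the talk names only the endpoints of the scale (highly trusted and restricted), so the two middle tier names, the class DatasetDeclaration, and its field names are assumptions made for illustration.

```python
from dataclasses import dataclass, field, asdict
from enum import IntEnum


class Tier(IntEnum):
    """Hypothetical four-tier scale; the middle two names are invented."""
    HIGHLY_TRUSTED = 1
    TRUSTED = 2
    ELEVATED_RISK = 3
    RESTRICTED = 4


@dataclass
class DatasetDeclaration:
    """Uniform governance record attached to a dataset (illustrative fields)."""
    name: str
    tier: Tier
    license: str
    source: str
    transformations: list[str] = field(default_factory=list)

    def record_transformation(self, step: str) -> None:
        # Append to the transformation history so lineage stays with the dataset.
        self.transformations.append(step)

    def to_dict(self) -> dict:
        # Emit the tier by name so downstream systems need not know the enum.
        data = asdict(self)
        data["tier"] = self.tier.name
        return data


declaration = DatasetDeclaration(
    name="support_tickets",
    tier=Tier.TRUSTED,
    license="CC-BY-4.0",
    source="internal-crm",
)
declaration.record_transformation("pii_redaction")
print(declaration.to_dict())
```

Because every dataset serializes to the same shape, any system that consumes the record can read licensing and transformation history without dataset-specific parsing.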
Container-native governance tools also allow policies and compliance checks to
travel with workloads, ensuring consistent governance whether models run on premises,
in hybrid setups, or across multiple clouds. There's also Kubernetes-based
enforcement that automates compliance
verification at multiple stages: before, during, and after deployment.
That minimizes manual intervention and
reduces the risk of human error.
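The talk does not say which Kubernetes mechanism is used for enforcement. One common option is a validating admission webhook, so the sketch below assumes that approach and shows only the decision logic: it takes an AdmissionReview request as a dictionary and builds the matching response. The annotation key and the tier threshold are hypothetical.

```python
TIER_ANNOTATION = "governance.example.com/tdd-tier"  # hypothetical annotation key
MAX_ALLOWED_TIER = 3  # hypothetical policy: restricted (tier 4) data blocks deployment


def review(admission_review: dict) -> dict:
    """Build an AdmissionReview response that allows or denies the workload."""
    request = admission_review["request"]
    metadata = request["object"].get("metadata", {})
    annotations = metadata.get("annotations") or {}
    raw_tier = annotations.get(TIER_ANNOTATION)  # annotation values are strings

    if raw_tier is None:
        allowed, message = False, f"missing annotation {TIER_ANNOTATION}"
    elif not raw_tier.isdigit():
        allowed, message = False, f"invalid tier value {raw_tier!r}"
    elif int(raw_tier) > MAX_ALLOWED_TIER:
        allowed, message = False, (
            f"tier {raw_tier} exceeds allowed maximum {MAX_ALLOWED_TIER}"
        )
    else:
        allowed, message = True, "governance check passed"

    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": request["uid"],  # must echo the request uid
            "allowed": allowed,
            "status": {"message": message},
        },
    }
```

In a real cluster this function would sit behind an HTTPS endpoint registered through a ValidatingWebhookConfiguration. Serving and registration are omitted here.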
Dashboards also play a critical part.
They provide continuous compliance verification with
near-instant validation feedback.
This enables teams to monitor governance status in real time and
quickly address any issues. And speaking of real time, the
real-time risk assessment tools continuously monitor data and model risk
from ingestion through runtime, making governance proactive by identifying
potential compliance gaps before they become problems.
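A minimal sketch of continuous risk monitoring, assuming that each pipeline stage emits risk signals with a severity between 0 and 1 and that an alert fires as soon as a signal crosses a threshold. The signal names, the severity scale, and the default threshold are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RiskSignal:
    stage: str       # e.g. "ingestion", "training", "runtime"
    name: str        # what was observed
    severity: float  # 0.0 (benign) to 1.0 (critical), a hypothetical scale


class RiskMonitor:
    """Collects signals as they arrive and flags the ones that need attention."""

    def __init__(self, threshold: float = 0.7):
        self.threshold = threshold
        self.signals = []

    def observe(self, signal: RiskSignal) -> bool:
        """Record a signal; return True if it should raise an alert right away."""
        self.signals.append(signal)
        return signal.severity >= self.threshold

    def score(self, stage: Optional[str] = None) -> float:
        """Highest severity seen so far, optionally restricted to one stage."""
        relevant = [
            s.severity for s in self.signals if stage is None or s.stage == stage
        ]
        return max(relevant, default=0.0)
```

Returning the alert decision from observe() is what makes the check proactive: the caller learns about a problem at the moment the signal is recorded rather than in a later batch review.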
Now, let me walk through a case study in financial services.
A major financial institution wanted to deploy generative AI for customer service,
but it faced very strict regulations.
By implementing TDD with industry-specific governance rules and compliance
checks, they cut compliance review time by 87% while maintaining
100% regulatory adherence.
That's governance automation in action.
Now, when it comes to cross-cloud implementation, TDD is very flexible
across all the cloud providers. On AWS,
it uses S3, Lambda, and SageMaker. On Azure,
it uses Blob Storage, Functions, and Machine Learning. On GCP,
it uses Cloud Storage, Cloud Functions, and Vertex AI. And with Snowflake, governance
extends through native procedures.
So regardless of your cloud strategy, TDD can be integrated.
Now here's a small Python snippet for automated license detection.
It shows how TDD scans directories, validates policies, and
flags risks before deployment.
This is just one example of how easy it is to bring governance into
your workflows programmatically.
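The snippet shown during the talk is not captured in this transcript. The following is a minimal sketch of the same idea, assuming each dataset lives in its own directory with an optional LICENSE file. The allowlist, the regular expressions, and the function names are illustrative assumptions. Text matching this crude is no substitute for a dedicated license scanner.

```python
import re
from pathlib import Path

# Hypothetical policy: licenses considered acceptable for training data.
ALLOWED = {"MIT", "Apache-2.0", "CC-BY-4.0"}

# Rough text patterns keyed by SPDX-style identifiers.
PATTERNS = {
    "MIT": re.compile(r"\bMIT License\b", re.IGNORECASE),
    "Apache-2.0": re.compile(r"Apache License,?\s+Version 2\.0", re.IGNORECASE),
    "CC-BY-4.0": re.compile(r"Creative Commons Attribution 4\.0", re.IGNORECASE),
    "GPL-3.0": re.compile(r"GNU GENERAL PUBLIC LICENSE\s+Version 3", re.IGNORECASE),
}


def detect_license(text: str) -> str:
    """Return the first matching license identifier, or "UNKNOWN"."""
    for name, pattern in PATTERNS.items():
        if pattern.search(text):
            return name
    return "UNKNOWN"


def scan(root: Path) -> dict[str, str]:
    """Map each dataset directory under root to its detected license."""
    results = {}
    for dataset_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        license_file = dataset_dir / "LICENSE"
        text = license_file.read_text(errors="ignore") if license_file.is_file() else ""
        results[dataset_dir.name] = detect_license(text)
    return results


def flag_risks(results: dict[str, str]) -> list[str]:
    """List datasets whose license is missing, unknown, or not on the allowlist."""
    return [name for name, lic in results.items() if lic not in ALLOWED]
```

A deployment gate could call flag_risks(scan(Path("datasets"))) and block the release when the returned list is non-empty. The directory name here is a placeholder.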
Now, here's how TDD fits into the MLOps pipeline. At ingestion,
metadata collection starts. During the pre-processing stage,
governance checks run.
In training, lineage is tracked end to end. Before
validation and deployment,
compliance is verified.
And finally, in production, monitoring ensures ongoing compliance.
So this way, governance is built in end to end.
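One reading of those stages can be sketched as an ordered list of governance gates, where the pipeline stops at the first stage that reports a problem. The stage names follow the talk, while the step functions, the context dictionary, and its keys are hypothetical stand-ins for real pipeline components.

```python
def collect_metadata(ctx: dict) -> list[str]:
    # Ingestion: capture governance metadata as soon as data enters the pipeline.
    ctx["metadata"] = {"license": ctx.get("license"), "source": ctx.get("source")}
    return []


def governance_checks(ctx: dict) -> list[str]:
    # Pre-processing: reject data that arrives without a declared license.
    return [] if ctx["metadata"]["license"] else ["dataset has no declared license"]


def track_lineage(ctx: dict) -> list[str]:
    # Training: record where the data came from and what happened to it.
    ctx["lineage"] = [ctx["metadata"]["source"], "preprocessed", "trained"]
    return []


def verify_compliance(ctx: dict) -> list[str]:
    # Validation and deployment: confirm that lineage exists before release.
    return [] if ctx.get("lineage") else ["lineage missing before deployment"]


STAGES = [
    ("ingestion", collect_metadata),
    ("preprocessing", governance_checks),
    ("training", track_lineage),
    ("validation_and_deployment", verify_compliance),
]


def run_pipeline(ctx: dict) -> tuple[str, list[str]]:
    """Run stages in order; return the failing stage and its issues, if any."""
    for stage, step in STAGES:
        issues = step(ctx)
        if issues:
            return stage, issues
    # Every gate passed, so the model moves on to monitored production.
    return "production", []
```

Stopping at the first failing gate keeps a non-compliant dataset from ever reaching training, which is the point of running the checks inside the pipeline rather than after it.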
So, to wrap up, governance doesn't have to be a burden.
With the Training Data Declarations framework, enterprises can
accelerate deployment, reduce legal and regulatory risks, build
trust with stakeholders,
and gain a sustainable competitive advantage.
Thank you for joining me today.
Please feel free to connect with me on LinkedIn.
I would like to continue the conversation.
Thank you.