Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello, this is Ani currently working as a technical team at Cox.
Today I'll be talking about securing enterprise ai and some of the different
concepts include including and how do we secure our enterprise ai,
which includes the governance risk and economics of L Element option.
Let's take a look into the the enterprise AI real check.
So what's the promise and what's the challenge?
So large language models are currently transforming how
enterprises are trying to operate.
And it also includes like promising, enhanced productivity, intelligence,
automation, and competitive advantages.
And our organizations across industries are raising to integrate AI
capabilities into their core systems.
And what are some of the challenges?
Most LMS are failing to reach production because there's like a
significant gap between the proof of concept and the production deployment,
which actually remains wide.
So some of the different things include that are like infrastructure
complexities compliance governance.
And security mis creating like far middle variable barrier for the ops teams.
Let's take a look at what most, why most initiatives are failing.
The first one is like infrastructure gaps.
So currently there's like a lack of scalable and secure infrastructure
designed specifically for AI workloads whereas the old traditional systems
are struggling with the compute.
And also failing to meet these storage requirements.
The second one is like governance vacuum.
So there's like a absence of clear policies of AI model selection and how
the data should be handled and what are the different the decision making
accountabilities and the compliance framework lags behind the ai qp.
There's a not well-defined compliance requirements that
are defined in the AI space.
And the third one is life security environments.
The security teams are basically unprepared for AI specific threats
which includes prompt injection model poisoning and data exfiltration through
exfiltration through the model outputs.
And the last one is like cost uncertainty.
Which includes like unpredictable expenses and unclear written non investment
makes it difficult to justify continued investment in scale beyond initial pilots.
Because with the with the AI systems, there's a lot of uncertainty in the
token usage and which, which drives towards like the unpredictable expenses
and the, and also unclear investment.
Let's take a look at the three pillar frameworks for secure and deployment.
The first one is like strategical infrastructure.
Which is which is a pretty, one of the major pillar for a CQ and LM deployment.
This particular chooses choosing the right deployment model is pretty key.
There are like three different deployment models, which is like
infrastructure export software as a service, and, platform as a service.
Each one offers different different capabilities and control compliance
requirements and also speed to market.
The second one is like economical analysis.
A rigorous certain total cost of ownership models that accounts for.
Compute costs, licensing operation overhead and performance
optimizations across AI lifecycle.
And the last one is like operational safeguards including like good practices
and gateway architecture for for enforcing like security, politics and
policies, monitoring behaviors and preventing cost of failures in production.
So let's look at the deployment models, starting with infrastructure service.
So with the infrastructure as a service, we the company has
like full control over the model selection, fine tuning optimizations
infrastructure configuration, which in which includes like provisioning
the infrastructure training the model.
During the model this is highly, this is ideal for like highly regulated
environments, which has like a really specific compliance requirements.
Some of the trade off of this approach is high operation burdens long time
to modern because the company's responsible for maintaining the
infrastructure of fine tuning the models.
The entire, the operation life cycle.
And this also requires like deep machine learning expertise and
a significant ops investment.
The second one is like platform as a service.
This is a balance between owning like the entire infrastructure because we are using
like a managed infrastructure approach with flexibility of the customization.
So you we are relying on a specific vendor for the infrastructure
piece, but we still own the.
Majority of the model development, fine tuning and optimization
piece, and some of the trade off.
Include like vendor lockin unlimited infrastructure care customizations,
because we are relying on a vendor and those vendors might not support our
requirements of the infrastructure, but this approach dramatically reduces
the overall operation complexity.
And the third one is like software as a service which includes which represents
like pre-built solutions with minimal setup, which is essentially our best
example is like Amazon Bit Rock.
In a perfect pool for standard use cases and rapid expectation
that the minimal technical over.
So we just use the AI software as a service.
And some of the trade offs includes like least control over model behavior
because since we are relying on the service, we don't have much control over
like how the model should be behaving.
And it also has a potential compliance concern because we'll
be relying on the third party.
Service large language models and which includes like limited customizations
for domain specific needs because there's like very little customization
that we can do if you are choosing the software as a service approach.
Let's take a look at five years total cost of ownership analysis.
The first one is like infrastructure code as I mentioned.
Compute resource storage, networking, and scaling requirements.
If we are using infrastructure as a service, this is one of the major
total cost of ownership analysis that we have to perform because
we'll be owning the compute.
We own the storage and we need to tweak the network configurations
and the scaling requirements.
The second one is like licensing and model, which is common
for all the three approaches.
API modeling model, licensing, and hosting increase.
The third one is like operation overhead.
Every, any system, any software system, needs to do a regress analysis
on the operation overhead, which includes like monitoring, maintenance.
Security and compliance.
The third one is talent and training.
Specialized ai, machine learning engineering, and also needs like
continuous improvement and ongoing education for the entire teams.
And the la and the last one is like optimization outputs, which is a key
because continuous improvement, fine tuning and efficiency game because the
overall AI system should be performed.
It should be optimized and should be continuously fine
tuned for efficiency gains.
Let's take a look at some of the performance metrics that
drives the demand investment.
The first one is like first open latency.
So which is nothing but the time until the model begins responding.
This is pretty critical for user experience, especially in our
interactive web application, like a chat bot because low latency always
drives like higher adoption and also good customer satisfaction.
And the second one is tokens per second.
Throughput capacity determines how many concurrent users you can serve.
So the higher the throughput reduce infrastructure needs
and improves fast efficiency.
So we need to provision like infrastructure that satisfies
like higher throughput needs.
So that, that gives us, good investment so that we are not spending
a lot on infrastructure, but still approaching a lot of user requests.
And the last one is like utilization rates.
Percentage of provision capacity, actively service request requests.
So optimizing utilization is the fastest path of improving the investments on
element infrastructure investment.
So let's take a look at the LL Mops Operational Excellency for ai.
So the first one is obviously the development which includes
prompt engineering, model selection and fine tuning.
And part of the prompt tuning.
We need to define like a a well crafted prompt, different using
like different prompt engineering technologies to efficiently.
Pass the contextual data to the l and m and the second one is model selection.
Using the right model for the right use cases is pretty essential because if
the use cases, text-based communication.
We use the right model if the use case about generating videos or
text and we need to select the right models or side the use cases.
And the third one in the development phase is like fine tuning our overall fine,
tuning the system for better performance, better output is pretty essential.
The second one is testing.
It's advers testing, bias testing and performance validation.
Since LLMs are non-deterministic, testing is one of the key part of,
for the operational excellence for the ai, which includes like the
bias detection early in the develop, early in the phase the better.
And we need to do a really good performance validation
of the overall system.
The third one is like deployment.
We need to do like a controlled rollout in include like AV testing
capabilities and can, because let's say we are are adding like a new feature
by doing like a controlled rollout, we can get like feedback immediately.
If there's like something goes wrong, it doesn't impact
the or the entire user group.
And the next one is like monitoring having like real time
observability around the AI systems.
Is pretty essential.
And also having a good cost tracking mechanism and also the quality metrics
because we can use this cost tracking and quality metrics data and do like
an optimization to the system and identify if the cost is getting overrun
we should, we can be able to quickly identify the places where the using
like this cost tracking system and if we have to improve the performance
during the performance optimization.
Getting this quality metrics in this monitoring phase is pretty essential.
For overall optimization, performance tuning and cost
reduction this is of our system.
The next one is like AI, GPI Gateway.
Our secure our security control panel.
The first one is like authentication authorization, which is the key security
metric that we need to be implemented at the AI API gateway, which includes
like verifying the identity and enfor enforcing, like role-based control access.
The second one is like great limiting in corners.
Implementing like a token based throttling prevents cost over
and abuse of the AI system.
Since the AI systems mainly use like tokens for input and output we can
definitely should definitely implement like some sort of rate limiting or ing
at the API gateway gateway level so that.
As I mentioned, it doesn't overrun our cost.
The next one is input sanitization, which includes like detecting and
prompting blocking, like prompting injection at the ai, a gateway level.
And the next one is output filtering which includes like scanning the responses
for data and redacting this instant data before returning to this customer.
And the last one is like semantic c anti caching which is one of the
important aspect of this AI a p gateway.
So that we don't this which reduces the c cost by caching like similar queries.
So we rely on this cache.
To get answers for similar queries.
And if if you're not able to find the in the query, then we'll have the system
which improves the overall optimized optimiz optimization of the system.
Now let's take a look at the critical security risk in operation with prompt
injection attack data, exfiltration, halalization, risk and cost overall.
Let's start with like prompt injection attacks because every system there
could be like malicious users that are trying to craft and put that
manipulates the overall model behavior.
And also bypassing like the security control controls and also extracting
like the training data of our model.
So implementing like a gateway level filtering and input
validations are essential.
Now to mitigate this prompt I detection attack.
The second one is the data exfiltration.
So the, some models may independently expose instrument information from the
training data or on the context window.
So output filtering and access control, limited exposure.
So we can implement both of these using our AI API gateway
that we previously discussed.
And the third one is like ization risk.
So model generates, incorrect information create liability
in the high stakes demand.
So because AI could definitely make some mistakes.
So considering like scoring and human in the loop validation will will mitigate
the risk where the stakes are high.
The last one is like course cost overrun.
Can rapidly escalate costs because we can quickly get into this loop of hall
and thinking continuously in the loop.
And there could be like some some malicious users who are like
constantly hitting our system.
So we need to implement some sort of like a talk base rate limiting and budgeting
alerts which prevents like financial surprises so that the caution shouldn't
be a surprise for the organization.
We should have good visibility into the cost usage and also good control over.
Let's take a look at domain specific models for regulatory industries.
Let's say for example, like financial services and healthcare.
They are like highly regulated and compliance regulated industries.
So we need to have domain specific models that specifically addresses the use
cases in these regulatory industries.
Especially Bloomberg, GBT and Med pal.
So these are like the financial services in the healthcare industries.
So these models are specifically trained on this financial
data and healthcare models.
So that these are volunt industries.
So using domain specific models for regulatory industry use cases
is one, one of the key essential.
When you're building like an AI system.
Now let's take a look at the best practices for production LLM deployment.
The first one is like in implementing ADVERS testing as we already
discussed regular, regularly testing models with malicious input and
also the each cases to identify vulnerability before attackers two.
So consider all possible cases in the testing scenario is one of the important,
production production ready system.
And the second one is like monitoring continuously.
We need to definitely track the performance metrics, the cost and
security events in the real time with automated alerting so that
we need to have these strict alert conditions around the performance
metrics cost and security events so that we should be able to identify
these issues sooner valuable later.
Next one is deploying semantic caching, as we discussed earlier, build like
a cast responses for similar carry queries, which has a performance
improvement reducing the cost and overall without sacrificing the quality.
The next one is like motion controlling everything.
Trading prompt configuration policies as a code with proper version, country
version controlling so that we can roll back these capabilities quickly
instead of doing like a hard fix.
The next one is like enforcing focus and rate limiting.
As we discussed earlier at the AI API gateway level, we need
to implement like a token based or a quota based rate limiting.
For cost controlling and preventing any abusive users from hitting our system.
So by, by implementing token lipids per user per time period with
intelligent court management system.
Implement for next one is like implement fallback strategies which
is design graceful degradation.
Then models are unavailable and produce low quality outputs.
So let's say we rely on a model which generates like a really good
output based on the user scenario.
If, for example, that specific model is unavailable.
So we need to fall back on using a low confidence output model so that we
generate an output to the customer instead of failing the entire user request.
The next one is establishing human in the loop workflow, which is very
critical decision requires a human oversight, particularly in regulated
industries and high stakes scenarios.
Having a human in the loop will be very essential and critical.
Document decision trials maintaining quality logs for the modern
decision for compliance, debugging, and continuous improvements.
We can make, we can maintain these decision trials and capture all
the model decision for compliance.
All the debug information which we can use that information for continuous
improvement for the overall NL system that we are trying to build.
The future Multimodel and agent DKI.
So the next evolution of enterprise AI moves beyond just text only
models but to a multimodal system that processes images, audio, video,
and text simulation simulations.
Agent DKI systems are, can auto autonomously plan, execute, task,
and interact with the expert tools.
So the currently we are in the text.
These are like the different.
Stages of this ai.
The one, the first one is like text only lms.
The current generation is efficiently using the LMS for text generation.
The next future is like multi models which has the capabilities
of processing like multimedia data.
And the last one is like building this systems building this autonomous system,
which has the planning, execution and adaptable capabilities built into it.
The overall goal, the DevOps systems must be prepared for disadvantages by
building flexible secure infrastructure that can accommodate rapid AI evaluation.
But while maintaining governance and really good compliance standards
from tools to the ecosystem are the integrated integrative.
Successful enterprises are moving beyond the streaming LLM as an isolated
tool, and instead integrating these LLM tools and these governance ecosystems
within their DevOps pipelines.
This means embedding AI directly into the development workflow, security
scanning, instant response and operation modeling with consistent governance and
operation across all the touch points.
So including AI in our day-to-day.
Entire project development type cycle is pretty essential.
Starting from the development phase building like a really good
development workflow using AI for our security scanning instant response
and also like operation monitoring.
But.
Including really good consistent governance and observability
across all these touch points.
And some of the key integration points are like CICD pipeline
for automation code review.
We can definitely leverage AI for automated code review, which can
be included as part of the pipeline and also as part of the security
scans and instant response system for inter ingen triage remediation.
We can definitely train a specific AI models.
With some proprietary incident and the response data so that we can build
this instant response system that could intelligently triage and remediate
any instance that we are occurring.
The next one's monitoring platforms and anomaly detection and root cause analysis.
We can definitely leverage ai for root cause analysis and root cause detection.
Which, but.
Which we can definitely leverage AI for faster.
Identifying the an anomalies and also identifying the root causes
and documentation system or automated knowledge base generation.
We can definitely use ai for billing, like really good knowledge base and
train a specific model based off of the appropriate knowledge based information.
So this is like from the tools, the ecosystem, and integrating AI in the
entire project development lifecycle.
From the development phase to the security scanning, instant response and operation
monitoring and also anomaly detection root cause analysis, and also building
the really good knowledge base system and integrating automated code review,
security scanning into our CLCD pipelines.
Thank you so much.