Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello everyone, I am Malika.
I am a software engineer working at Barclays as a business intelligence developer,
mainly focusing on building AI-powered solutions that drive smart decision making.
So my tech stack is Qlik, QlikSense, Tableau,
SQL data models, and cloud-based AI/ML integration using Python,
mainly focusing on turning complex data into actionable insights.
So I'm here to discuss MLOps for scalable BI and real-time
analytics: a comprehensive guide to building, deploying, and
managing production-grade machine learning systems in financial services.
So today's agenda focuses on five areas.
We'll start with the fundamentals of MLOps in financial services,
then walk through the ML lifecycle from ingestion to deployment.
After that, we'll look at monitoring and governance, and finally how all
of this connects with the BI tools that decision makers use every day.
So we'll go through them one by one.
Yeah.
Next is the MLOps maturity gap in financial services, where you can
see that 83% of ML models in finance never make it to production.
That number should scare us, because it means billions spent on talent
and platforms without business value.
The reasons are familiar: complex requirements that slow down deployment,
legacy systems that resist integration, and real-time performance needs
that most pipelines can't handle.
The big insight here: the bottleneck isn't in model development.
It's everything that happens after,
from building models to managing them.
The reason is clear: it's not enough to develop a great algorithm.
Financial banks need frameworks that automate audit trails, integrate
models with legacy systems, and monitor performance in real time.
This is where MLOps comes in.
Think of it as the bridge between model development and business impact,
automating the pipelines that handle deployment, logging, and compliance checks.
The next stage would be the versioning and monitoring tools,
which make sure every model update is stress-tested, so
risk reviews that used to take weeks
can happen in days or even hours.
The point is, when you solve the "after" problem, all the
investment in talent and technology
finally starts paying off: models move from prototype to production, and
business teams actually see the results.
That's how we close the gap between building something
amazing and making it useful.
Yeah.
The next slide would be the financial ML lifecycle.
It has multiple stages,
and each stage has its financial twist. Coming to the first
stage, data ingestion:
it is not just pulling in CSV files; you are streaming market data, ingesting
payment data, and even using alternative data sets. Coming to
the next stage, feature engineering, which is critical for risk signals:
things like rolling windows, credit utilization ratios, or liquidity metrics.
And coming to the next stage, model training, which needs versioning
and hyperparameter tracking, but also regulatory documentation.
Every decision needs to be explainable.
Coming to the next stage, deployment: models must be containerized,
deployed via continuous integration and continuous deployment, and often
tested within the required frameworks.
And the last stage is monitoring.
This includes drift, performance metrics, compliance
auditing, and financial impact.
So each stage has unique demands in finance, especially around
documentation and auditability.
Coming to the next slide, which is the tech stack for MLOps in fintech,
I can go through the breakdown. Data engineering and
management would be the first task.
Stage one is data pipelines: tools like Apache Airflow, Apache
Spark, and Kubeflow help build automated data pipelines that handle
data extraction, transformation, and loading (ETL) to ensure
high-quality inputs for the models.
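The ETL idea above can be sketched in a few lines. This is a minimal toy sketch, assuming a list of comma-separated trade records; in practice this logic would live inside an Airflow task or a Spark job, and the field names here are illustrative.

```python
# A minimal ETL sketch: extract raw rows, transform (validate/cast), load.

def extract(raw_rows):
    """Extract: parse raw CSV-like rows into dicts."""
    return [dict(zip(["ticker", "price", "volume"], r.split(","))) for r in raw_rows]

def transform(rows):
    """Transform: cast types and drop rows with non-positive volume."""
    cleaned = []
    for r in rows:
        price, volume = float(r["price"]), int(r["volume"])
        if volume > 0:
            cleaned.append({"ticker": r["ticker"], "price": price, "volume": volume})
    return cleaned

def load(rows, store):
    """Load: append validated rows to the target store (a plain list here)."""
    store.extend(rows)
    return store

store = []
raw = ["AAPL,189.5,100", "AAPL,189.7,0", "MSFT,410.2,50"]
load(transform(extract(raw)), store)   # the zero-volume row is filtered out
```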
And within the same area, data versioning and tracking tools like DVC (Data
Version Control) enable teams to track dataset changes across
experiments, making it easier to reproduce and debug models.
Another thing is data quality monitoring: ensuring data
quality through validation.
Monitoring tools help detect issues such as missing values
or anomalies that could degrade model performance, which comes
under data quality monitoring.
The other module is model experimentation and
versioning. For experiment tracking,
tools like MLflow are used to log model hyperparameters, metrics, and
configurations, allowing data scientists to systematically
compare experiments.
For the comparison we use Neptune.ai, and within the same
area, model versions are stored in a registry like the MLflow
Model Registry, which documents the metadata, training data,
and performance metrics for versioning and tracking deployments.
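The experiment-tracking and registry idea can be sketched as follows. This is a minimal in-memory sketch, assuming toy model names and metrics; MLflow or Neptune.ai provide the production equivalent of these two functions.

```python
# A minimal sketch of experiment tracking plus a model registry.

registry = []

def log_run(model_name, params, metrics):
    """Record one training run's hyperparameters and metrics, version-stamped."""
    version = sum(1 for r in registry if r["model"] == model_name) + 1
    run = {"model": model_name, "version": version,
           "params": params, "metrics": metrics}
    registry.append(run)
    return run

def best_run(model_name, metric):
    """Compare experiments: pick the logged run with the highest metric."""
    runs = [r for r in registry if r["model"] == model_name]
    return max(runs, key=lambda r: r["metrics"][metric])

log_run("credit_risk", {"lr": 0.10, "depth": 4}, {"auc": 0.81})
log_run("credit_risk", {"lr": 0.05, "depth": 6}, {"auc": 0.84})
winner = best_run("credit_risk", "auc")   # the second, higher-AUC run
```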
And coming to deployment.
Deployment with CI/CD pipelines uses automated model testing
and validation, ensuring that all models are thoroughly
evaluated before deployment.
This streamlines the continuous integration and deployment
of new models. For scalable deployment, containerization with
Docker and orchestration with Kubernetes enable flexible and
scalable model deployments,
adapting resources to real-time demand.
And BentoML and Amazon SageMaker, a fully managed
cloud-based machine learning service, provide tools and services to
build, train, and deploy ML models.
And the next stage would be monitoring and maintenance.
We are using Evidently AI for drift detection and for
metric collection as well.
Yeah.
The next slide would be feature engineering in financial data.
Financial ML requires specialized feature engineering to capture temporal
patterns, market microstructure, and risk factors. Coming to temporal patterns:
in finance, the timing of events is critical.
Generic features like average price don't capture market dynamics; you want
features that reflect how price and volume evolve over time, right?
For example, moving averages smooth out short-term fluctuations
and highlight trends,
so comparing short-term versus long-term averages can signal momentum changes.
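The moving-average idea above can be sketched quickly. This is a minimal sketch on a toy price series, assuming trailing windows; the window lengths (3 versus 5) are purely illustrative, not recommendations.

```python
# A minimal sketch of rolling-window features: short vs long moving averages.

def rolling_mean(series, window):
    """Trailing moving average; None until a full window is available."""
    out = []
    for i in range(len(series)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(series[i + 1 - window:i + 1]) / window)
    return out

prices = [100, 101, 103, 102, 105, 107, 110]
short_ma = rolling_mean(prices, 3)   # reacts quickly to recent moves
long_ma = rolling_mean(prices, 5)    # smooths out short-term noise

# A short MA above the long MA is one simple momentum signal.
momentum_up = short_ma[-1] > long_ma[-1]
```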
Coming to market microstructure features: this is the
bid-ask spread, the difference between the buying and selling price,
and within the same area, depth imbalance,
transaction costs, and slippage.
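Those microstructure features can be derived from a single order-book snapshot. This is a minimal sketch with illustrative field names and toy numbers, not a production feed handler.

```python
# A minimal sketch of market-microstructure features from top-of-book data.

def microstructure_features(best_bid, best_ask, bid_depth, ask_depth):
    """Compute spread, mid price, and depth imbalance."""
    spread = best_ask - best_bid                  # cost of crossing the book
    mid = (best_ask + best_bid) / 2               # reference price
    # Depth imbalance in [-1, 1]: positive means more resting buy interest.
    imbalance = (bid_depth - ask_depth) / (bid_depth + ask_depth)
    return {"spread": spread, "mid": mid, "imbalance": imbalance}

feats = microstructure_features(best_bid=99.98, best_ask=100.02,
                                bid_depth=600, ask_depth=400)
```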
Coming to the next slide, CI/CD for financial ML models, which is
continuous integration and continuous deployment. Coming to data
validation: every new dataset must pass
quality and bias checks.
The next stage would be model training.
The training pipeline generates documentation and risk
reports automatically.
And the next stage would be testing and staging, where canary
deployments reduce risk before the full rollout. Regulatory gates are the next stage:
compliance reviews are integrated into the pipeline,
so approvals aren't manual bottlenecks.
CI/CD isn't just a DevOps idea; it's a governance framework in financial ML.
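The data-validation gate mentioned above can be sketched as a single pipeline check. This is a minimal sketch with hypothetical field names and a toy batch; real pipelines would use richer checks (or a tool such as Great Expectations) and would fail the CI job when the gate returns false.

```python
# A minimal sketch of a CI/CD data-validation gate: fail the batch
# if too many rows are missing required fields.

def validate_batch(rows, required=("customer_id", "income"), max_missing=0.1):
    """Quality gate: True only if the missing-field rate is within tolerance."""
    missing = sum(1 for r in rows if any(r.get(f) is None for f in required))
    return missing / len(rows) <= max_missing

batch = [{"customer_id": i, "income": 50000 + i} for i in range(9)]
batch.append({"customer_id": 9, "income": None})   # one bad row out of ten

ok = validate_batch(batch)   # 1/10 missing passes the 10% gate
```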
So the next slide would be the real-time inference architecture.
This is where MLOps meets systems engineering, with low-latency
use cases like fraud detection, algorithmic trading, and instant credit scoring.
If you are too slow, either fraud slips through or a legitimate
customer gets declined.
So the tech stack solutions are model optimization,
GPU acceleration, and streaming platforms like Kafka, sometimes even
edge deployments when milliseconds matter.
The other thing is scalability patterns.
To handle market volumes that vary widely, you need auto-scaling clusters,
load balancing, and orchestration frameworks to keep core services stable. In finance,
performance is not just nice to have; it's a must.
And next would be compliance and governance for fintech.
Governance is the foundation: model cards that comply with SR
11-7, explainability reports using SHAP and LIME, version control
for validation artifacts. For audit, you need immutable logs,
reproducible environments, and traceable deployments.
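A model card can be kept as a simple audit artifact. This is a minimal sketch with hypothetical field values; an SR 11-7-compliant card would carry far more detail (validation results, limitations, ongoing monitoring plans).

```python
# A minimal sketch of a model card serialized as an immutable audit artifact.

import json

model_card = {
    "name": "credit_default_v3",          # hypothetical model name
    "version": "3.1.0",
    "training_data": "loans_2019_2023",   # dataset identifier for lineage
    "metrics": {"auc": 0.84, "ks": 0.41},
    "explainability": ["SHAP summary", "LIME samples"],
    "approved_by": "model-risk-committee",
}

# Write once, never edit in place; store alongside the model version.
card_json = json.dumps(model_card, sort_keys=True)
```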
Security also matters: encrypting pipeline components, using strict
role-based access control, and scanning containers. Without these
you don't have MLOps, you have risk exposure.
And next, monitoring financial ML in production.
There are a few things to track in production when monitoring
machine learning. The first is to track data and
prediction drift over time.
Your model's input data and its predictions can shift, leading to decreased accuracy,
so monitoring these drifts helps identify when retraining is necessary.
Watch for training-serving skew: ensure that the data used during training closely
matches the data in production.
Significant differences can cause the model to perform poorly.
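A simple skew check can compare one feature's distribution between training and serving. This is a minimal sketch, assuming we can sample the same feature from both sides; the z-score threshold is an illustrative choice.

```python
# A minimal sketch of a training-serving skew check on one feature.

from statistics import mean, pstdev

def skew_alert(train_values, serve_values, z_threshold=3.0):
    """Flag skew when the serving mean drifts several training std-devs away."""
    mu, sigma = mean(train_values), pstdev(train_values)
    z = abs(mean(serve_values) - mu) / sigma
    return z > z_threshold

train = [0.30, 0.32, 0.29, 0.31, 0.30, 0.33, 0.28, 0.31]   # credit utilization
serve_ok = [0.31, 0.30, 0.32, 0.29]        # matches training -> no alert
serve_skewed = [0.55, 0.58, 0.60, 0.57]    # shifted distribution -> alert
```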
Next would be monitoring data pipeline health.
Issues in data processing can introduce errors or delays affecting the model's
performance; keeping an eye on the pipeline ensures smooth operation.
The other would be evaluating model performance continuously, which
means backtesting and other metrics to assess how the model is
performing under real-time conditions.
This helps in making informed decisions about updates or retraining.
Finally, set up alerts: implement alerting mechanisms to quickly
detect and respond to any unexpected changes in model behavior or performance.
So by focusing on these areas you can maintain the reliability and
effectiveness of your machine learning models in production.
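The alerting step can be sketched as a threshold check with a pluggable notifier. This is a minimal sketch, assuming a callback-based notifier; in production the notify hook would post to a pager or chat channel, and the metric band is illustrative.

```python
# A minimal sketch of a metric alert: fire when a value leaves its band.

alerts = []

def check_metric(name, value, lower, upper, notify=alerts.append):
    """Fire an alert when a monitored metric leaves its expected band."""
    if not (lower <= value <= upper):
        notify(f"ALERT: {name}={value} outside [{lower}, {upper}]")
        return True
    return False

check_metric("daily_auc", 0.83, lower=0.75, upper=1.0)   # healthy, no alert
check_metric("daily_auc", 0.61, lower=0.75, upper=1.0)   # degraded, alert fires
```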
Coming to the next slide: concept drift management in financial models.
Drift management in financial models
is about detecting changes in data
and relationships that can degrade
model performance over time.
It involves monitoring input features and model predictions,
using metrics like PSI or KL divergence to flag shifts. When drift is detected,
models are recalibrated or retrained on recent data to maintain accuracy.
So continuous monitoring and governance ensure the model stays reliable in
dynamic market or credit environments.
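The PSI metric mentioned above can be computed directly from binned counts. This is a minimal sketch, assuming pre-binned score distributions; a common rule of thumb (not a regulatory standard) treats PSI above 0.2 as significant drift.

```python
# A minimal sketch of the Population Stability Index (PSI) for drift detection.

from math import log

def psi(expected_counts, actual_counts):
    """PSI = sum((a% - e%) * ln(a% / e%)) over matching bins."""
    e_total, a_total = sum(expected_counts), sum(actual_counts)
    value = 0.0
    for e, a in zip(expected_counts, actual_counts):
        e_pct, a_pct = e / e_total, a / a_total
        value += (a_pct - e_pct) * log(a_pct / e_pct)
    return value

baseline = [200, 300, 300, 200]   # score distribution at training time
current = [210, 290, 310, 190]    # similar distribution -> small PSI
shifted = [400, 300, 200, 100]    # shifted distribution -> large PSI
```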
So the next slide would be integrating ML with BI tools in finance.
The integration approaches are APIs for model prediction endpoints,
embedded Python for in-dashboard computation, pre-computed
feature stores for low-latency access, and
custom visualization extensions for model insights.
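The pre-computed feature store pattern can be sketched simply. This is a minimal in-memory sketch, assuming features refreshed by a nightly batch job; the entity IDs and feature names are illustrative.

```python
# A minimal sketch of a pre-computed feature store for low-latency
# dashboard access: batch writes, constant-time reads.

feature_store = {}

def publish_features(entity_id, features):
    """Batch side: write pre-computed features keyed by entity."""
    feature_store[entity_id] = features

def get_features(entity_id, default=None):
    """Serving side: O(1) lookup, no model call needed at query time."""
    return feature_store.get(entity_id, default)

publish_features("cust-42", {"util_ratio": 0.31, "risk_score": 0.12})
row = get_features("cust-42")
```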
These capabilities presuppose a BI platform such as Tableau,
Power BI, Looker, or ThoughtSpot.
The purpose of integrating ML with BI
tools in finance is predictive analytics: ML algorithms
analyze historical data to forecast
future trends such as sales or customer behavior, enabling
proactive decision making, along with automated insights,
enhanced forecasting and detection, and personalized recommendations.
So that would be integrating the BI tools with ML.
Next would be NLP-driven
financial analytics,
which is nothing but AI.
At its core, NLP technologies are built on the intersection of computer
science and artificial intelligence.
And so the main core components of NLP systems are
tokenization, parsing, and machine learning.
Breaking text down into individual
words or short phrases,
allowing systems to analyze the smaller parts of a sentence,
is what tokenization does.
And the next would be parsing:
understanding the grammatical structure helps the system interpret
relationships between words.
And finally, NLP models can generate more accurate output by using
machine learning, training with vast amounts of text data;
additional data can enhance performance, but
models also need careful monitoring to avoid degradation over time.
So machine learning uses algorithms to detect patterns in
text, enabling NLP systems to identify and categorize data efficiently.
This approach allows models to recognize recurring themes, sentiments,
and entities within text data.
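The tokenization and pattern-detection steps can be sketched together. This is a minimal toy sketch using word frequencies as a crude theme signal; real financial NLP systems use trained models, not raw counts.

```python
# A minimal sketch of tokenization plus frequency-based theme detection.

from collections import Counter
import re

def tokenize(text):
    """Break text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def top_themes(docs, n=2):
    """Count recurring tokens across documents as a crude theme signal."""
    counts = Counter(tok for d in docs for tok in tokenize(d))
    return [word for word, _ in counts.most_common(n)]

docs = ["Fraud risk rose this quarter.",
        "Analysts flag fraud in card payments.",
        "Credit risk and fraud dominate the report."]
themes = top_themes(docs)   # "fraud" recurs most, then "risk"
```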
And the key takeaways are:
start with the architecture, designing machine learning pipelines
with specific requirements in mind from day one; integrate early,
connecting ML capabilities directly into the existing BI tools
that users already trust; and make governance count:
compliance isn't an afterthought,
so build it into your workflows.
Yeah, that's all.
Thank you.