Transcript
This transcript was autogenerated. To make changes, submit a PR.
Good morning and good afternoon.
Thank you, Conf42, and the organizers for giving me this amazing opportunity.
Let me start with a brief background of the problem that we are going to talk about and how to solve it.
In the biomedical research field, more than a million research papers are published every year, and if you take the citations of those papers into account, the volume multiplies. A tsunami of information is out there.
When a researcher comes to do research in this field, they first look to do a literature review. That's the first step in any research process. When they attempt to do that, they have to go through these millions of records. They can do a quick search somewhere, but in spite of that, finding the relevant literature is very difficult.
First of all, there may be good literature out there that stays hidden from the researcher's search, and that's not good. There are also processing issues: if they take these millions of records into account, it will take forever to do the literature review. To sum up, I would say researchers spend more than 80 percent of their time just doing this literature review.
So this is the clear problem statement, which we have already discussed. Our solution for this problem is intelligent AI literature agents. It contains four subdomains: domain-adapted LLMs (LLM is large language model); an advanced RAG system, that is, an advanced retrieval-augmented generation system; a precision NER system (NER is nothing but named entity recognition) for genes, proteins, and drug interactions; and robust MLOps infrastructure.
Let's go through the agenda. We'll first start by talking about the MLOps architecture. Then we'll move on to the biomedical NLP pipeline. We'll also talk about vector database scaling after that, followed by monitoring and observability. And finally, we'll close with continuous experimentation and deployment strategies.
MLOps architecture overview.
So when we talk about MLOps, everyone might have heard about this. MLOps is how to operationalize the machine learning pipeline. It is a critical part: just by developing the models, we cannot achieve everything. We have to implement them and make them production ready. MLOps plays a crucial role in any machine learning lifecycle.
If you ask me to define an effective MLOps architecture, I would say a few components are necessary. The first one is containerized microservices, for modular scalability and versioning. Then a model registry: models keep getting updated, so we need to have them versioned in the model registry. Asynchronous processing pipelines for efficient batch citation updates, because the research community is very vibrant and citation results and citation counts vary every day; we need the latest information to give the researcher a correct picture of the literature. Clear separation of embedding generation from retrieval services. A real-time inference API layer engineered for sub-second responses. And a dedicated evaluation environment to ensure rigorous scientific validation.
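To make the model registry piece concrete, here is a minimal sketch using MLflow. The talk doesn't name a registry tool, so MLflow, the tracking URI, and the "biomedical-ner" model name are all assumptions for illustration.

```python
import mlflow
import mlflow.sklearn
from mlflow.tracking import MlflowClient
from sklearn.linear_model import LogisticRegression

mlflow.set_tracking_uri("http://mlflow.internal:5000")  # hypothetical registry endpoint

# Stand-in for a real trained model from the pipeline
model = LogisticRegression().fit([[0.0], [1.0]], [0, 1])

with mlflow.start_run():
    mlflow.log_metric("entity_f1", 0.90)
    # Logging with registered_model_name creates a new registry version
    mlflow.sklearn.log_model(
        sk_model=model,
        artifact_path="model",
        registered_model_name="biomedical-ner",  # hypothetical model name
    )

# Promote the newest version once validation passes
client = MlflowClient()
latest = client.get_latest_versions("biomedical-ner", stages=["None"])[0]
client.transition_model_version_stage(
    name="biomedical-ner", version=latest.version, stage="Production"
)
```

Keeping every model version in the registry is what lets the serving microservices roll forward and back independently of training.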
When we look at the biomedical NLP pipeline, there are a few challenges with domain adaptation. Biomedicine is something only a few people who are really into biotechnology, biology, and healthcare are interested in, and it has lots of jargon and technical terms. Not everyone can understand this biomedical vocabulary, or even the language the researchers use.
So it poses some challenges, because biomedical language is complex: it is a highly specialized vocabulary across sub-disciplines; entity relationships, like gene-protein interactions, require domain expertise to give the researcher a good understanding; contextual meaning changes across research areas; and there is a need for continual updates as new discoveries emerge. Traditional NLP pipelines fail in this domain without specialized adaptation techniques. So as a machine learning engineer, or somebody who does NLP on this biomedical data, not everyone can do that: they should have certain knowledge about this domain and its language, and a certain level of understanding.
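As a hedged illustration of why domain adaptation matters, here is a tiny NER example. The talk doesn't name a library; scispacy's biomedical model is assumed here purely to show the contrast with a general-purpose English model.

```python
import spacy

# en_core_sci_sm is scispacy's small biomedical model (install scispacy and
# the model wheel first); a general English model would typically miss or
# mislabel gene and drug mentions like the ones below.
nlp = spacy.load("en_core_sci_sm")

doc = nlp("BRCA1 mutations increase sensitivity to PARP inhibitors such as olaparib.")
for ent in doc.ents:
    print(ent.text, ent.label_)
```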
Model training and deployment workflow.
Model training starts with data ingestion, cleaning, and exploratory data analysis; those are typical steps in any machine learning project. Once you do all those things, you develop the model, check it against the test set and validation set, and you have a working model. So imagine that data ingestion happens, then you run the training pipeline and the evaluation framework. We need specialized metrics for biomedical accuracy, with domain expert review, and then we finally deploy the models.
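A minimal, self-contained sketch of that gated train, evaluate, deploy flow, using scikit-learn as a stand-in for the real biomedical pipeline; the 0.90 threshold and the print-based "deploy" step are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

# Stand-in data for the ingested and cleaned corpus
X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
f1 = f1_score(y_test, model.predict(X_test))

# Gate deployment on the evaluation metric; in practice a domain expert
# review happens here as well before anything ships.
if f1 >= 0.90:
    print(f"F1={f1:.2f}: promote model to deployment")
else:
    print(f"F1={f1:.2f}: hold back for retraining and expert review")
```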
Scaling vector databases for 30-plus million citations.
The operational challenges are efficiently generating embeddings for a massive document corpus, balancing recall against computational cost, and managing seamless index updates. These are the operational issues with vector embeddings. Our MLOps solution to counter them is asynchronous embedding generation pipelines with batch processing, hierarchical indexing strategies, read replicas with staged updates, and optimized query caching. These are the techniques to counter those shortfalls, or challenges, I would say.
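To illustrate the hierarchical indexing idea, here is a sketch using FAISS's IVF index; the library choice, embedding dimension, and corpus size are assumptions standing in for the 30-plus million citation corpus.

```python
import numpy as np
import faiss

d = 384                                    # embedding dimension (assumed)
corpus = np.random.rand(100_000, d).astype("float32")  # stand-in embeddings

nlist = 1024                               # number of coarse clusters
quantizer = faiss.IndexFlatL2(d)           # top level: coarse cluster centroids
index = faiss.IndexIVFFlat(quantizer, d, nlist)
index.train(corpus)                        # learn the cluster structure
index.add(corpus)                          # in production, add in async batches

index.nprobe = 16                          # clusters searched per query
query = np.random.rand(1, d).astype("float32")
distances, ids = index.search(query, 10)   # top-10 nearest citations
```

The nprobe setting is exactly the recall-versus-computational-cost knob mentioned above: searching more clusters improves recall at higher cost.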
RAG architecture and production implementation.
There are several key components: domain-specific embedding models, a multi-stage retrieval pipeline, citation network enrichment, and context window optimization. With respect to performance metrics, we have to evaluate performance against a baseline to make sure it has indeed improved: sub-second query latency, 90 percent accuracy in entity recognition, and a citation relevance score above 85 percent. Such KPIs are really important when evaluating performance. It's not just about doing things; we also need to make sure we are doing the right thing and getting the expected results within the stipulated time. And the final metric is the reduction in researcher literature search time.
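Here is a hedged sketch of the multi-stage retrieval idea: a cheap vector recall stage followed by a rerank that folds in a citation network signal. The 0.8/0.2 weights and the data shapes are illustrative assumptions, not the production values.

```python
import numpy as np

def retrieve(query_vec, doc_vecs, citation_counts, k_recall=100, k_final=10):
    # Stage 1: coarse recall by cosine similarity over the whole corpus
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9
    )
    candidates = np.argsort(-sims)[:k_recall]

    # Stage 2: rerank candidates, enriched with a citation network signal
    citation_score = np.log1p(citation_counts[candidates].astype("float64"))
    citation_score /= citation_score.max() + 1e-9
    final = 0.8 * sims[candidates] + 0.2 * citation_score  # assumed weights
    return candidates[np.argsort(-final)[:k_final]]

# Toy usage with random stand-in embeddings and citation counts
docs = np.random.rand(1000, 64)
cites = np.random.randint(0, 500, size=1000)
top = retrieve(np.random.rand(64), docs, cites)
print(top)
```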
Monitoring and observability framework.
Once we deploy the model, the data may change; it keeps evolving. The model that worked efficiently when we first implemented it may not work over time. So we need to constantly monitor the models and then do fine-tuning. Model performance tracking, scientific accuracy validation, and user interaction analysis are really important. Once we are done with the work, we cannot say it's all done and never needs revisiting. We have to come back again, with certain metrics to evaluate every time, and if the model is derailing or deviating from the desired outcome, we need to intervene and fine-tune it, or enhance the ecosystem to produce the right output.
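As a minimal example of that monitoring loop, the sketch below compares a reference window of retrieval scores against a live window with a two-sample KS test; both the test choice and the 0.05 threshold are assumptions for illustration.

```python
import numpy as np
from scipy.stats import ks_2samp

# Stand-in score distributions: at deployment time vs. observed this week
reference_scores = np.random.beta(8, 2, size=5000)
live_scores = np.random.beta(6, 3, size=5000)

stat, p_value = ks_2samp(reference_scores, live_scores)
if p_value < 0.05:
    print(f"Drift detected (KS={stat:.3f}): trigger review and fine-tuning")
else:
    print("No significant drift; keep monitoring")
```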
Real-time monitoring dashboard.
Our custom monitoring solution provides both ML engineers and scientific stakeholders with visibility into model drift detection, citation accuracy, system health and performance metrics, user query patterns and success rates, resource utilization and scaling triggers, and A/B testing performance comparison.
Continuous experimentation methodology.
We have implemented an A/B testing framework that balances experimentation with scientific reliability. It enables targeted cohort testing by research domain, measures both technical metrics and research outcomes, provides statistical confidence for biomedical applications, and supports multivariate testing across model components. This approach has yielded an 80 percent reduction in literature search time while maintaining scientific rigor in the results.
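For the statistical-confidence step, here is a sketch of a two-proportion z-test comparing search success rates between control and variant cohorts; the counts are made up, since the talk specifies the methodology rather than the numbers.

```python
from statsmodels.stats.proportion import proportions_ztest

successes = [420, 468]   # successful searches: [control, variant]
trials = [1000, 1000]    # cohort sizes

stat, p_value = proportions_ztest(successes, trials)
if p_value < 0.05:
    print(f"Variant wins with statistical confidence (p={p_value:.4f})")
else:
    print(f"No significant difference yet (p={p_value:.4f}); keep collecting")
```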
Infrastructure scaling patterns.
We have to have a baseline infrastructure: core components sized for our average load with two-times redundancy. Elastic scaling layers: autoscaling inference endpoints based on query volume, and scheduled scaling events. Predictive scaling for known search patterns, especially around the academic term: if you observe that demand increases, you may have to plan for the scaling in advance. And specialized compute: GPU allocation for embedding generation and inference. Our infrastructure scales efficiently across both batch processing needs and real-time query patterns. This is really important: we make sure we are doing neither less nor more, but exactly the right thing.
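A toy sketch of the scheduled/predictive scaling heuristic described above: set a floor on inference replicas ahead of known academic-term peaks instead of reacting only to live load. The peak months, the queries-per-second-per-replica figure, and the replica counts are all assumptions.

```python
from datetime import date

PEAK_MONTHS = {1, 2, 9, 10}   # assumed academic-term months with high demand

def target_replicas(today: date, live_qps: float) -> int:
    base = max(2, round(live_qps / 50))    # reactive part: ~50 QPS per replica
    if today.month in PEAK_MONTHS:
        base = max(base, 6)                # predictive floor ahead of peaks
    return base

print(target_replicas(date(2025, 9, 1), live_qps=120))  # -> 6 during term
```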
Key MLOps learnings from biomedical AI.
As we already saw, integrating domain expertise is really important; without the involvement of domain scientists, it's difficult to do MLOps right. Developing scientific validation pipelines is another important learning: we cannot be sure the performance has improved just because the time to search has come down; we also have to measure against other scientific variables and KPIs. Enabling continuous corpus updates is really important too: the corpus keeps growing, so we have to stay up to date with the latest and greatest information. And prioritizing explainability: biomedical research applications necessitate significantly higher transparency and interpretability standards compared to consumer AI systems.
And with that, I'm concluding. Thank you for this opportunity. Thanks a lot.