Transcript
Hello everyone. My name is Mi Panda, and today I'll be presenting an interesting topic: how should we think about scaling AI with large models?
As you see, since the advent of ChatGPT in November 2022, there has been a lot of disruption happening in the field of generative AI. Every week we see new, sophisticated models. Now the key question for executives is how they can leverage large models: what the ecosystem should look like, what the architecture archetypes would look like, and what the infrastructure should be, to make sure that you'll be able to scale generative AI use cases.
So from my experience, I have brought a few pieces of content, but what's important is the key design principles. To make it very simple: if we want to enable generative AI use cases at scale, the infrastructure has to be in the cloud so that it is easier to adapt and to solve many use cases. We are not talking about just building one chatbot kind of use case. It's about building a whole portfolio of generative AI use cases across the organization.
So the first design principle is flexibility. Your architecture design must be flexible enough because, as I mentioned, we have new large language models coming up every week. Make sure that you are not locked in. Your architecture must be flexible.
Second is scalability. There's no denying that your architecture must be solid enough to support scaling. With the cloud, elastic scaling, and elastic hosting, leveraging cloud capabilities, you can achieve that.
The third is the security piece of it. Security is very important, and as we know, privacy and security matter especially when we try to scale generative AI use cases. And the final principle is monitoring. We always leverage some tools, but we have to make sure that we apply the right rigor to the monitoring of generative AI use cases.
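To make the monitoring principle concrete, here is a minimal sketch in Python of wrapping model calls with basic telemetry; the `call_model` stub and the metric fields are illustrative assumptions, not any specific product's API.

```python
import time
from dataclasses import dataclass, field

@dataclass
class CallMetrics:
    """Basic telemetry for one generative AI call."""
    prompt_chars: int
    response_chars: int
    latency_s: float
    model_name: str

@dataclass
class Monitor:
    """Collects metrics so latency and usage can be tracked as use cases scale."""
    records: list = field(default_factory=list)

    def track(self, model_name, call_fn, prompt):
        start = time.perf_counter()
        response = call_fn(prompt)  # call_fn is whatever LLM client you use
        latency = time.perf_counter() - start
        self.records.append(CallMetrics(len(prompt), len(response), latency, model_name))
        return response

# Hypothetical stand-in for a real LLM client call.
def call_model(prompt: str) -> str:
    return "stub answer to: " + prompt

monitor = Monitor()
print(monitor.track("demo-llm", call_model, "Summarize our Q3 sales notes."))
print(monitor.records[0])
```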
Now, when you look at architecting AI-driven ecosystems, infrastructure is really important. This is where we have the compute, the storage, the network, the monitoring, the provisioning, et cetera. And for generative AI applications, the GPU is really important to train large models and to drive that kind of application, right?
The next is data management and engineering, which we have been doing since the traditional machine learning and AI days: we need to integrate with various source systems, acquire the data, store it, and make sure that we have the right data governance piece in place.
Then we go to the analytics platform side of it, which is the third bucket, the third layer. This is where we build the models, serve the models, train them, test them, et cetera. And then we go to the modern software platform.
What the modern software platform talks about is whether our architecture is designed for real-time, event-driven, or microservices patterns, and how AI can be embedded into various business processes. Whether we are working in fintech, healthcare, or pharmaceuticals, irrespective of the industry, we must make sure that AI is embedded into various business processes to drive value.
And finally, the fifth point is the digital channels and the experience modalities. This is where you have mobile, the web, and wearable devices. If you're working in the healthcare industry, you know you have wearable devices, and the data you are getting you're going to connect with various source systems and integrate with your platform. And the sixth is security. Security is really important, and as you see here, the sixth layer cuts vertically across all the channels. So make sure that you have a privacy- and security-first architecture; it's a kind of approach. You should not think of it at the end; it should happen from the beginning of your architecture journey.
Now let's deep dive here. Again, this is illustrative, but if you see here, number one is infrastructure at the bottom, where we are leveraging cloud and distributed computing, moving away from on-prem to cloud, and then to a multi-cloud approach as well. This is where we leverage GPUs and GPU-as-a-service, serverless storage, scalable storage, infrastructure provisioning, et cetera.
Then we go to the data management and engineering side of it. This is where we ingest the data; we capture batch or real-time changes through CDC; we transform the data; and we make sure that we apply data quality tools. For generative AI, I'll just focus on the generative AI component here: the vector database is very important in this second layer, the data management and engineering layer. This is where we leverage the semantic search power of the vector database, right?
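As a minimal illustration of the ingest-transform-quality-check step described above, here is a sketch in Python; the record fields and the quality rule are assumptions made for the example, not a specific CDC tool's schema.

```python
from datetime import datetime, timezone

# Hypothetical change events, e.g. as they might arrive from a CDC stream.
incoming_changes = [
    {"op": "insert", "id": 1, "text": "New oncology trial protocol approved."},
    {"op": "update", "id": 2, "text": ""},  # fails the quality check below
]

def passes_quality_checks(record: dict) -> bool:
    """A toy data-quality rule: required fields present and text non-empty."""
    return all(k in record for k in ("op", "id", "text")) and bool(record["text"].strip())

def transform(record: dict) -> dict:
    """Normalize and timestamp the record before it lands in the curated store."""
    return {**record, "text": record["text"].strip(),
            "ingested_at": datetime.now(timezone.utc).isoformat()}

curated, rejected = [], []
for change in incoming_changes:
    if passes_quality_checks(change):
        curated.append(transform(change))
    else:
        rejected.append(change)  # route to a quarantine/review queue

print(f"curated={len(curated)}, rejected={len(rejected)}")
```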
On the LLM side of it, we have open-source LLMs and we have closed LLMs. Make sure that, based on your use case, you are leveraging the right large model, and make sure that you're also evaluating it properly, so the selection of the LLM happens correctly.
Then model serving and LLM inference are really important. Again, it depends on the use case how much latency you need.
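A simple way to sanity-check serving latency against a use-case requirement is sketched below in Python; the `call_model` stub and the 500 ms budget are assumptions for illustration.

```python
import statistics
import time

# Hypothetical stand-in for a call to your model-serving endpoint.
def call_model(prompt: str) -> str:
    time.sleep(0.01)  # simulate inference work
    return "answer"

LATENCY_BUDGET_S = 0.5  # assumed use-case requirement, e.g. interactive chat

latencies = []
for _ in range(50):
    start = time.perf_counter()
    call_model("ping")
    latencies.append(time.perf_counter() - start)

p95 = statistics.quantiles(latencies, n=20)[-1]  # 95th percentile
print(f"p50={statistics.median(latencies):.3f}s  p95={p95:.3f}s  "
      f"within budget: {p95 <= LATENCY_BUDGET_S}")
```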
And then the next layer is the AI application and software layer, and finally, the digital channels. This has been a view of how all the layers are stacked together, how your ecosystem should fit together, and the options that you have.
Now, what are the kinds of strategic, foundational choices we should be thinking about for the generative AI building blocks? Again, going back to the six layers that I defined earlier, the first was infrastructure. We have a lot of choices: we have single cloud, we have multi-cloud, we have SaaS, and there are a lot of infrastructure services, such as Nvidia's. These infrastructure services are also going to be very important, and they're coming up as well.
Second is the vector database. We have open-source options, and we have paid ones such as Pinecone or Weaviate. There are a lot of enterprise-level and application-level vector storage options, and so on and so forth. Whatever database approach you take, please make sure that your architecture is really flexible and solid.
Then LLM selection. Again, going back to what I just mentioned, make sure that you are selecting the right large language model for your generative AI applications. And then the MLOps or LLMOps platforms: you can definitely leverage open source, or you can go and bring in best-of-breed, niche tools. There are a lot of platforms available in this market.
And finally, multi-cloud architecture for generative AI. It's a no-brainer. The adoption of multi-cloud is happening these days and will continue to happen, and I have served various clients on their generative AI applications leveraging multi-cloud.
So that's it for the foundational choices. Now, unstructured data is the fuel of large language models. But if you look at the last 30 or 40 years in the industry, we did not pay a lot of attention to unstructured data management; we did not apply the same rigor that we applied to structured data management. And this is very obvious, right? It is very obvious, and I still believe that.
Now, there are three key data management challenges here. One is ingestion within the firm. The second is the storage piece of it: making sure that you have the right governance in your storage practices. The data should not be sitting in your email or on your hard drive; it should be in a centralized store and must be labeled correctly. And then there is the extraction and ETL side of it: making sure that you have the right data pipelines with the right governance strategy. This is when you'll be able to really leverage the power of unstructured data to drive your generative AI use cases.
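To illustrate the centralize-and-label idea, here is a minimal Python sketch that attaches governance metadata to unstructured documents as they are ingested; the metadata fields (owner, classification) are assumptions for the example.

```python
from dataclasses import dataclass

@dataclass
class GovernedDocument:
    """An unstructured document plus the governance labels it must carry."""
    doc_id: str
    text: str
    owner: str            # who is accountable for this data
    classification: str   # e.g. "public", "internal", "confidential"

def ingest(raw_docs: list[dict], central_store: list[GovernedDocument]) -> None:
    """Move documents out of ad hoc locations into a labeled central store."""
    for raw in raw_docs:
        central_store.append(GovernedDocument(
            doc_id=raw["id"],
            text=raw["text"].strip(),
            owner=raw.get("owner", "unassigned"),  # flags missing ownership
            classification=raw.get("classification", "internal"),
        ))

store: list[GovernedDocument] = []
ingest([{"id": "memo-01", "text": " Q3 clinical notes... ", "owner": "data-team"}], store)
print(store[0])
```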
Now look, there are certain difficulties in building and operationalizing generative AI use cases as well, beyond the typical AI/ML project challenges that we already know. In terms of technical challenges, you definitely see hallucinations, but there are certain techniques that we must adopt to reduce them. You have MLOps, you have databases and vector databases, you have cybersecurity risk, you have diverse data inputs, and then you have scalability and deployment challenges as well. Make sure that you have the right ecosystem, the right flexible architecture, to deal with these technical challenges.
In terms of the operational challenges, we already know that the user interface and user experience are very important. If we are building generative AI applications, we should have the right roadmap and also the right talent strategy; that's going to be another conversation. Now, going back to the tech stack and the architecture: this is very high level.
Again, this is illustrative, but I'll really emphasize six key elements as we think of the generative tech stack and the data and infrastructure side of it. As you see, the first three architecture elements are the large language models, the embeddings, and the vector database. These are really important. And what should be your decisioning criteria to choose the LLMs, the embeddings, or the vector database?
To choose the large model and embeddings, the first criterion is definitely capabilities. It is not the case that just one large model is going to cover or solve all kinds of use cases for you, so you have to understand the capabilities. The second is the security aspect of it. The third is ease of use. And the fourth is multilingual capabilities, which you also need to think of.
In terms of the vector database, again, we look at latency, the performance and scalability side of it, flexibility, and then availability. Not every vector database is available in every cloud, so make sure that you understand that and what kind of vector database would be appropriate for your organization. And also cost; cost is very important.
Then, in terms of infrastructure, it is very simply platform hosting and the data store. In terms of the platform, what should be your decision considerations? It's a kind of choice: compliance with the security norms of your organization, and the scalability to accommodate new use cases, making sure that your platform and infrastructure are really robust, to help you accelerate various use cases.
Now, just a couple of examples here; it's very intuitive to understand how those six elements fit together. We have a bunch of documents of unstructured data. Then we apply a chunking strategy: we create chunks, or divide the documents into chunks, and then we embed them and load them into the vector database. This is very simple, right? Both the metadata and the document embeddings are prepared here, and then they get loaded into the vector database.
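As a minimal sketch of this chunk-embed-load step, here is an example in Python; the toy `embed` function (a letter-frequency vector) stands in for a real embedding model, and the in-memory list stands in for a vector database. Both are assumptions for illustration.

```python
import string

def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split a document into overlapping character chunks."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(text: str) -> list[float]:
    """Toy embedding: letter-frequency vector. A real system calls an embedding model."""
    counts = [text.lower().count(c) for c in string.ascii_lowercase]
    total = sum(counts) or 1
    return [c / total for c in counts]

# In-memory stand-in for a vector database: (embedding, chunk text, metadata) rows.
vector_db: list[tuple[list[float], str, dict]] = []

document = "Wearable device data must be centralized, labeled, and governed before use."
for i, piece in enumerate(chunk(document)):
    vector_db.append((embed(piece), piece, {"doc": "policy-01", "chunk": i}))

print(f"loaded {len(vector_db)} chunks into the vector store")
```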
Then, in your architecture, when a user query comes in, it gets embedded using the same embedding technique that we used to load the vector database. Then the similarity or semantic search happens against the vector database, and you retrieve the top three or top five documents. Those retrieved documents then get fed into the LLM along with the query, and then you get the right answer. This is just a very basic, naive approach to RAG, retrieval-augmented generation, applications.
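Here is the matching query side as a self-contained sketch: embed the query with the same toy embedding, run a cosine-similarity search over a small in-memory store, and assemble a prompt for the LLM. The prompt template and sample documents are assumptions for illustration.

```python
import math
import string

def embed(text: str) -> list[float]:
    """Same toy letter-frequency embedding used in the ingestion sketch above."""
    counts = [text.lower().count(c) for c in string.ascii_lowercase]
    total = sum(counts) or 1
    return [c / total for c in counts]

# Toy vector store; in the full flow this is populated by the ingestion sketch.
vector_db = [
    (embed(t), t, {}) for t in [
        "Wearable device data must be centralized and labeled.",
        "Vector databases support semantic similarity search.",
        "Quarterly sales figures rose in the fintech segment.",
    ]
]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Top-k semantic search against the vector store."""
    q = embed(query)
    ranked = sorted(vector_db, key=lambda row: cosine(q, row[0]), reverse=True)
    return [text for _, text, _ in ranked[:k]]

query = "How should wearable data be governed?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)  # this prompt would then be sent to the chosen LLM
```

In a production setup, the toy embedding and in-memory store would be replaced by the embedding model and vector database chosen earlier in the talk.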
There is more to it, but I'll end by giving this last piece here: if you want to scale your generative AI use cases, your delivery team must have the right capabilities, starting from your use case definitions. Make sure that you are defining the use case properly and identifying what the KPIs should look like. How would you measure the performance? How do you measure the KPIs? This is very important. What is the value you're going to get out of it?
The second is definitely the discovery part of it: what kind of data is necessary to enable that particular use case? And then the third is the setup: do you have the right infrastructure? Did you choose the right LLM? Did you choose the right vector database? Et cetera, et cetera.
And then you start getting into the development piece of it: making sure that you are designing the UI and UX and developing the UI properly, that your backend is correct, and that it is a modular and flexible architecture leveraging containerization principles and techniques. Then you move to deployment and monitoring, and then finally to application maintenance and support.
Now, in terms of the key resources, we definitely need a data engineering and data science team: data engineers, data scientists, LLMOps, MLOps, or DataOps engineers, whatever you call them, ML engineers, then product owners, product managers, site reliability engineers, and UX developers. So it spans the whole lifecycle, starting from the use case definitions, from discovery to design, to development, to deployment, to maintenance and support.
That's all for this presentation. I kept it very short, and I hope you liked the presentation. Thank you very much.