Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hi everyone, my name is Sandeep, and welcome to my session at Conf42.
Today we're going to be talking about the different ways in which you can
integrate AI into your application.
So let me share my screen and we can get started.
A big shout out to Conf42 for selecting my talk.
It helps me spread the knowledge with the community.
So — LLMs, right from data to constructive insights.
We're going to be looking at a sample application, and we will talk through
its journey — how we have evolved the usage of AI in that application.
My name is Sandeep.
I work as a principal solution architect at AntStack.
I'm also an AWS Community Builder, and I've been certified as an AWS
Solutions Architect Professional. I love building serverless applications.
I've been building serverless applications for the last six years,
and once you start building serverless applications, there's no turning back.
So let's get started.
Like I mentioned before, AntStack is a company that builds serverless
applications for our clients, and we have a large engineering team as well.
Now with this comes the next set of problems: we have multiple project
managers who run different sprint schedules. None of them are in common,
and each of them has their own way of collecting feedback from the
developers at the end of each sprint. So this data was pretty much spread
out, and it was not usable because it's not centralized and nobody knows
what's happening.
And the other problem we had is our folks did not like to give negative
feedback. We don't know why, but every time we asked them to give feedback,
it was always positive. Everything is kick-ass. And if we tried giving them
a scale — rate your peers between one and ten — everybody got a ten.
Every single person.
So we came up with a solution: build an internal feedback application.
What it does is provide a simple interface which allows you to create
projects and manage them, and it also sends a notification to the user
when the feedback cycle has started: now you need to provide feedback to
your peers. And all the data is, again, stored in one single place.
So we have access to the data at any given point in time and everything is
now centralized, but we still had the other problem, which was everybody
getting 10/10 ratings. That's when we started experimenting with Bedrock
and AI to see how it could help us out.
So we started with the playground. What we did is we started testing out a
simple prompt, which basically says: I'm going to give you a feedback; now
you need to categorize it into three different categories — positive
energy, productivity, and reliability — and rate them between one and
five, one being the lowest and five being the highest; and here's the
feedback that the user has received.
So to do this, we started using Bedrock. Bedrock has a nice chat interface
which allows you to test out different prompts. In the Amazon Bedrock chat
playground, you can select different AI models, provide the same prompt,
and see what kind of responses each model is providing. Seeing them side by
side helps you understand the kind of responses each model is giving, and
you can select the right model for you.
What it also provides is the metrics: the latency, how many input tokens
it's consuming, how many output tokens it's producing. This also helps you
map out what each model is going to cost you, so you know which one you
should be using for your use case.
With this being done, we got the prompt working.
The problem was consistency.
As you all know, writing a prompt is just the first step
in building your application.
The prompt has to be consistent irrespective of what kind of
inputs you're going to give to it.
The prompt must always deliver what it's supposed to deliver.
So that is when we started stepping into prompt engineering.
Now, these are the basic rules that we follow for prompt engineering.
First thing you gotta do is set a persona or a role for your AI. In our
case, the role of the AI is to be an evaluator, which basically takes the
input — the feedback that the user has provided — extracts the elements of
it across reliability, productivity, and positive energy, and quantifies
each of them.
Next, you provide an action. So you basically tell the AI how it needs to
do the task, and you provide the positive and negative cases. In our case,
we had three categories, and if a comment was not applicable to one of
these categories, it would rate it minus one. So these are some of the
negative cases we had inputted, so the AI can let us know what exactly the
feedback is referring to.
Next, we provide the variables. In our case, the variable is just the
feedback that is input from the user. And then we also set the output
format. We wanted a JSON output format which has these three items —
productivity, positive energy, and reliability — and each should have a
value between one and five; negative cases are going to have minus one.
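To make that concrete, here's a minimal sketch of what such an evaluator prompt and call can look like in Python, assuming boto3's Converse API and a Claude model ID — the wording and helper are simplified illustrations, not our exact production prompt:

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

# Persona + action + negative case + output format, with {feedback} as the variable.
EVALUATOR_PROMPT = (
    "You are an evaluator of peer feedback. Categorize the feedback below "
    "into positive_energy, productivity, and reliability, and rate each "
    "between 1 (lowest) and 5 (highest). If the feedback does not fit a "
    "category, rate that category -1. Respond with JSON only, like "
    '{{"positive_energy": 4, "productivity": 3, "reliability": 5}}.\n\n'
    "Feedback: {feedback}"
)

def evaluate_feedback(feedback: str) -> dict:
    response = bedrock.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",  # assumed model choice
        messages=[{
            "role": "user",
            "content": [{"text": EVALUATOR_PROMPT.format(feedback=feedback)}],
        }],
    )
    # Assumes the model obeys the JSON-only instruction.
    return json.loads(response["output"]["message"]["content"][0]["text"])

print(evaluate_feedback("Always unblocks the team and ships on time."))
```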
Now, to do this in Amazon Bedrock, you have a prompt builder playground,
which allows you to create a prompt and test out all the different
variations of it as well. So you can set a prompt, you can set the
variables that are there in the prompt — in our case, that's the feedback —
and you can test it out.
It also allows you to create different variants of your prompt so you can see
which prompt is working best for you.
You can test it out against the same model, or you can test it
out against different models based on your requirement.
Now, if you look at this example: on the left side of the screen, we are
asking the prompt to be lenient with the ratings, and on the right side,
we are asking it to be strict. It is just this one small line of difference
which affects the output a lot. You can see the responses that are coming
from the AI — it's the same model, but there's only one line of difference.
This is how you experiment with different variants of your prompts and see
which is the best prompt that works for you.
Now, with all this being done, we created a chatbot interface where the
user can provide the input — provide the feedback — and three months go by.
We have a lot of data: about 500 feedbacks across the organization. Now we
start to think about what we can do with this data — can we make it more
insightful to the people so they can start improving themselves? And that
is when we started looking into vector databases and RAG.
Before we jump into what we did with the application, let's go through some basics.
What is a vector? A vector is basically a mathematical representation of
the data that you have provided. To oversimplify, it's basically an array
of numbers.
A dimension is a property of the data that you're providing. So let's say
we are providing a fruit, which is an apple or an orange, and it has
different properties: What is the color? What is the sweetness? What is the
sourness? Each of these becomes one of the dimensions, which is basically a
characteristic of the data that you're providing.
And what are indexes? Indexes are the entry points into your database for
the search process. So let's say you have an index for fruits; when you
provide the input, it's going to search on these indexes and find out what
you're looking for.
Okay, now what is embedding? Embedding is basically the process where your
raw data — text or any other kind of data — is taken, converted into
vectors, and stored in the database under the right indexes. This entire
process is called embedding.
And what is RAG? Basically, when you ask a question like "list some red
colored fruits," the AI understands the intent of your question and creates
a vector out of it. Now that vector is used against the indexes that got
created, and it does a nearest neighbor search — there are multiple search
patterns, but nearest neighbor search is the most popular one — and it
identifies the data that you're looking for and provides the response. In
this case, it will take that vector, find the matching red fruits, and
return them.
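As a toy illustration of those terms — vectors, dimensions, and nearest neighbor search — here's a tiny self-contained Python example with made-up numbers (no Bedrock involved):

```python
import numpy as np

# Each dimension is one property of the fruit — color (0 = green, 1 = red),
# sweetness, sourness — on a 0-1 scale. Values are invented for illustration.
fruits = {
    "apple":  np.array([0.9, 0.7, 0.3]),
    "orange": np.array([0.4, 0.6, 0.5]),
    "lime":   np.array([0.0, 0.2, 0.9]),
}

def nearest_neighbor(query: np.ndarray) -> str:
    # Cosine similarity: the fruit whose vector points in the closest direction wins.
    def score(v):
        return np.dot(query, v) / (np.linalg.norm(query) * np.linalg.norm(v))
    return max(fruits, key=lambda name: score(fruits[name]))

# "List some red colored fruits" would embed to something heavy on the color dimension.
print(nearest_neighbor(np.array([1.0, 0.5, 0.2])))  # -> apple
```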
So if you were to do this in Amazon Bedrock, there's a very easy way to get
started: you can start with something called "chat with your document."
Bedrock has this functionality called Knowledge Bases, and over there you
can create different knowledge bases — or you can start with the easy one,
chat with your document. As it sounds, you just upload a document in the
portal and you can start chatting with it. It gives you a nice chat
interface. You can provide a custom prompt if you want, but that's
basically it: you just upload a file and ask questions about that file.
Now, to create your own database, there are multiple options. Firstly, you
can create a knowledge base with a vector store, which is the most common
pattern. But with the recent re:Invent announcements, they also improved
these functionalities, so now you can create a knowledge base with
structured data stores like Amazon Redshift, which is the data warehousing
solution, or you can also create them on Kendra GenAI indexes.
So for our use case, we started with the simple one, which is the vector
store. The way you do it is you provide a data source; in our case, we used
S3. All the feedback that we got from the users, we started uploading into
an S3 bucket, and that's the source we provided for the knowledge base.
Next, you have to select the parsing strategy. Now, what is the parsing
strategy? If your data consists of text or other simple data, then the
default Amazon Bedrock parser is good enough. But if your data is
complicated — it has some kind of media content: images, videos, audio,
anything like that — then you probably need to use another model to do the
parsing. For our use case, we used the default parser.
Next comes the chunking. Chunking is basically a way of splitting your data
into smaller sizes and getting it stored in the database. Think of it like
a record in a regular database, except that one chunk is stored as one
entry in the vector database. So this is very important.
So here's what happened in our case. We started with default chunking: we
selected default chunking, gave the S3 bucket as the source, and it did the
chunking. And when we asked a question — can you provide feedback about
this user — it did provide the feedback, and it did summarize it really
well. But the problem is it summarized the feedback across all the users.
So if I'm asking for feedback for Sandeep, instead of giving just my
feedback, it was giving feedback for everybody else as well. And that's
when we understood that the default chunking we were using was not working
out.
Then we tried different options and eventually ended up with no chunking.
When you select no chunking, every single file you provide in the S3 bucket
is considered as one single chunk. So my file basically becomes one single
chunk, and everybody else's does as well. Now when I ask a question about
my feedback, it'll query only my set of feedback, summarize it, and answer
the question that I'm asking.
Now, if you want some advanced chunking strategy which none of the current
ones fit, you can also use a Lambda function to do some kind of custom
parsing and add your own logic on how you want to chunk these files.
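For instance, the preprocessing side of our "one file per user" setup could look roughly like this — a sketch assuming a hypothetical bucket name and record shape, so that the no-chunking strategy turns each user's file into exactly one chunk:

```python
import json
from collections import defaultdict
import boto3

s3 = boto3.client("s3")
BUCKET = "feedback-knowledge-base"  # hypothetical bucket name

def write_per_user_files(feedbacks: list[dict]) -> None:
    """Group feedback by recipient and write one S3 object per user."""
    by_user = defaultdict(list)
    for fb in feedbacks:
        by_user[fb["email"]].append(fb["comment"])
    for email, comments in by_user.items():
        s3.put_object(
            Bucket=BUCKET,
            Key=f"feedback/{email}.json",
            Body=json.dumps({"email": email, "feedback": comments}),
        )
```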
Next, you select the embedding model. Like I said, your data needs to be
converted into vectors, and that is done by an embedding model. In our
case, we used Titan Embeddings, but there are other offerings available as
well.
Now, when you select the embedding model, have a look at the vector
dimensions it creates. In this case, Titan Embeddings creates 1,536
dimensions. So every single piece of data that you provide is split into
1,536 dimensions, each of them having its own properties, which signify a
particular characteristic of the data that you're providing.
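A minimal sketch of generating one of those embeddings yourself, assuming the Titan Text Embeddings model via the InvokeModel API:

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

def embed(text: str) -> list[float]:
    # Titan Text Embeddings takes {"inputText": ...} and returns the
    # vector under the "embedding" key.
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]

vector = embed("Great collaborator, always reliable.")
print(len(vector))  # 1536 dimensions for this model
```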
Next, you select the vector database that you want to create. Now, AWS has
made it easy: you can just select quick create and it creates a database
for you — you don't have to configure much. But if you're an advanced user,
you can select a vector database that you have created yourself and provide
the specific connection information for it.
When you select quick create, there are a couple of options. Amazon
OpenSearch is the default and the most widely used vector store, but with
the recent announcements there is also support added for PostgreSQL and
Amazon Neptune, which is a high-end graph analytics database on which you
can run graph queries. GraphRAG is like an advanced RAG: RAG does
relationships on one level; graph does it on two levels. So that's the
simplified version of it.
For our use case, we started testing with OpenSearch Serverless. If you
want to create your own vector database instead, these are the options that
are available for you. Once you select the database on a quick create, all
you have to do is click next, and your data source is created.
Now, after the data source is created, you need to do something called a
sync. That's when Amazon Bedrock actually takes the files from S3, creates
the embeddings, and stores them in the database. Once that is done, you can
start querying your database.
Now, there are multiple search options available. Hybrid search worked for
us because it searches both on semantics and on text, but it's again left
to you based on your data.
Next comes the number of source chunks. Our use case was very specific: we
want only one particular file for one user, so the number of chunks it had
to return is only one. But based on your use case, you can increase it up
to a hundred and retrieve the data that you are looking for.
And then you select the model that you want to generate the responses with.
Next, the knowledge base also has a custom prompt that you can provide.
By default, when you're trying to query a knowledge base, there's a
built-in prompt that Amazon uses. If for some reason that doesn't work out
for you and you want to tell the knowledge base how it needs to query the
database, you can do that. You can provide a custom prompt where you
mention how the data needs to be understood and what kind of questions the
users are asking, to make more context out of it and give you the best or
most relevant answers.
So this is what the result looks like. Our queries have two things: email
and query — who's requesting it, and what the query is. Let's say in this
case Manik is asking: provide my top three points of improvement. What this
does is it takes Manik's file, which has all the feedback he has received
so far, identifies the intent of the question, summarizes all of it, and
gives the feedback.
Same with the next user: we provide the email ID and ask for, say, points
of improvement and things like that. So it is going to take the feedback
that user has received, and it's going to summarize it based on the
question that has been asked.
Now, with all this in place, we enabled this chatbot for the admins. As an
admin, I can query other people's feedback to understand how they're doing
and what improvement points I need to work on with them.
So this essentially started as a mentor-mentee relationship. What we
started doing is, as mentors, we would query our mentees' feedback and ask
questions like: How are they doing? What are the things they need to
improve on? Then we get a summarized result based on what everybody has
said about them, and we take those points and start guiding the mentee on
how they should improve themselves.
So this is how it started. And then we thought: why not just enable it for
the users themselves? Then they can ask questions, see how they're doing,
and improve themselves. We don't need to have these feedback sessions every
time; they can ask the question whenever they want and work on the points
that they need to improve on.
But there was one catch: the current chatbot allows anybody to request
anybody else's feedback. So that's when we started looking at agents.
As an oversimplification, an agent is basically a custom functionality that
you want to put into the thought process of an AI. In Amazon Bedrock, you
have Agents, where you can create the specific agent that you're looking
for and select the model that it needs to run on.
And the beauty of it is that the actual action execution happens inside an
AWS Lambda, which is a serverless component, and you can create multiple
Lambdas to be used in the same agent. And you can chain them all using the
prompt that you're writing. So when you create the agent, you provide the
prompt, which says how it needs to interact with these action groups to
provide the response that you're looking for.
But in our case, it's very straightforward. We are telling the AI agent:
you are going to receive a query, and this is the flow you need to follow.
Your flow is: identify the user email, identify what sort of question
they're asking — is it a question about themselves, or are they asking a
question about other users — then create these two inputs and call an
action group.
An action group basically means it needs to invoke a Lambda. This Lambda is
going to get these two inputs: the email and what kind of query they're
asking.
of admin users and regular users.
Now, if the question is by an admin, we allow the request.
But if a user, regular user is asking a question for other people's feedback.
We deny the request and we have instructed the agent in such a way
that based on the response it receives from the action group, it should either
allow the request or deny the request.
So when you create the agent, the first thing you do is provide
instructions to the agent. You can also enable memory — by default it is
disabled, but if you want to have persistent sessions, you can enable
memory and consume it accordingly.
Next, you create the action groups. Action groups are basically a bunch of
Lambdas. You provide the name of the action group in the instructions that
you write for the agent, and you tell it what it needs to do, how it needs
to call this action, and what the next action it needs to take is.
Apart from this, you can also provide knowledge bases to an agent — the
knowledge base that we created previously, you can add it over here — and
now the agent has access to both the action groups and the knowledge base.
So you can basically tell it: take these three actions, query this
knowledge base, and then call another action. Anything that you want, based
on the workflow that you're looking for, can be done by using action groups
and knowledge bases in conjunction with an agent.
So this is how we define an action group. You start by giving a name, you
select some basic parameters, and you create the Lambda. Now, this is the
beauty of it: here I'm specifying two parameters, query type and email. In
my prompt, I have mentioned the instructions that it needs to identify
these two parameters and then call the action group. So when the Lambda is
invoked, it is always going to get these two parameters.
Now, this is what the Lambda looks like — just simple code. We have a list
of admin users, and when the input comes in, we just check against that
list. If it's an admin, we allow the request. If it's not, we just throw a
response saying you're not authorized to run this query. Then the agent
takes this response, summarizes it, and provides it to the end user.
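A minimal sketch of that action-group Lambda, assuming Bedrock Agents' function-details event and response shape, with the admin list inlined instead of a database:

```python
ADMINS = {"admin@antstack.io"}  # placeholder admin list

def lambda_handler(event, context):
    # Bedrock passes the parameters the agent extracted from the prompt.
    params = {p["name"]: p["value"] for p in event.get("parameters", [])}
    email, query_type = params.get("email"), params.get("query_type")

    if query_type == "other_user" and email not in ADMINS:
        body = "DENY: user is not authorized to query other people's feedback."
    else:
        body = "ALLOW: proceed with the query."

    # Response envelope the agent expects back from an action group.
    return {
        "messageVersion": "1.0",
        "response": {
            "actionGroup": event["actionGroup"],
            "function": event["function"],
            "functionResponse": {"responseBody": {"TEXT": {"body": body}}},
        },
    }
```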
And this is how it looks. In the first one, the email is
random@antstack.io, which is obviously a fake email, and the query is: list
Sandeep's top feedbacks. So basically some other person is asking for my
feedback, and obviously "random" is not part of the admins group. So the
agent is going to respond saying you don't have access to run this query.
In the next scenario, the email is my own email ID, and I'm querying my
feedback: just list my feedback and summarize it in 20 words. And of
course, the AI is going to query the knowledge base and provide the result
accordingly.
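Calling the agent from code looks roughly like this — agent, alias, and session IDs are placeholders, and the completion streams back in chunks:

```python
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime")

response = agent_runtime.invoke_agent(
    agentId="AGENT1234",        # placeholder
    agentAliasId="ALIAS1234",   # placeholder
    sessionId="session-1",
    inputText="email: user@antstack.io — list my feedbacks and "
              "summarize them in 20 words",
)

# The completion comes back as an event stream of byte chunks.
answer = "".join(
    event["chunk"]["bytes"].decode("utf-8")
    for event in response["completion"]
    if "chunk" in event
)
print(answer)
```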
Now, once we added this functionality — once we were able to segregate the
queries between what a user is asking and what an admin is asking — we
enabled that chatbot for all the users. Everybody could look at their
feedback, ask questions about how they can improve themselves, or what the
top three positive comments are, what the negative comments are, and get an
idea of how they're doing and move forward with that.
At this point, we wanted to see what more we could do — how much further we
could push the system — and that's when we started looking into an
ambitious plan. What we wanted to do was this: we had skill assessments for
each of these users, so we wanted to see if we could integrate all of that
into one platform. At the end of it, if a user asks a question — how can I
improve my career? — we take all the feedback that user has received, we
take their skill sheets, we take their current role designation and what
their next role designation is, summarize all of this, and point the user
in the right direction to get their promotion.
Imagine doing this. You don't have to sit on a call with your boss at the
end of a year to know whether you're going to get the promotion or not. You
can query this chatbot, say, once in three months or once a month, and see
what you're required to do to get that promotion.
We call this section Divide and Conquer. You can do all these things in a
single prompt or a single agent, but it makes the system complicated: every
time you write a prompt which has multiple things, multiple steps to do,
the AI becomes inconsistent in such a system. That's why we call this
divide and conquer, where we split the execution — split the
functionalities — into smaller AI chunks and consume them that way.
Amazon Bedrock has a beautiful way to let us do this: it's called prompt
flows. Prompt flows are basically — if you're aware of Step Functions — a
step function for AI. It allows you to create different prompts, Lambdas,
S3 retrievals, knowledge bases, agents — all these things — and it allows
you to create a map or a workflow out of them and take the actions that
way.
So in our case, we split the entire functionality that we wanted into
smaller pieces, which gives us more control. The other advantage you get by
doing this is that for each of these chunks, you can have a different AI
model doing that particular thing. So if you have some small classification
that needs to be done, you don't need to run a Claude 3.5 or a Claude 3.7;
you might as well just run a smaller model, which is more cost efficient.
It might take a little bit longer, but it is way more cost effective than
using a very large, expensive model.
So here's what we did. Our first step is a prompt which identifies the
intent of the user. This prompt just identifies who the user is and whether
the question is for them or for somebody else, and then it passes on to a
Lambda which does the role check for the user. This Lambda is connected to
a database which has the list of admins and the regular users. Based on the
input it's getting, it's going to identify whether the question is from an
admin or a regular user, and whether they're asking for themselves or for
others, and provide the response accordingly.
Then it goes into a condition block. This condition checks the output
received from the Lambda — in the Lambda, we say the next step is proceed,
or the next step is end. Based on this, it is going to take the next
action: if the action is to end the execution, it moves to a flow output,
which is basically the end of the prompt flow; if it's not, then it goes
into the classify question prompt.
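The decision itself is tiny; a sketch of the role check, leaving out the flow node's exact event envelope — the "next_step" value is just our own convention for the condition node to compare:

```python
ADMINS = {"admin@antstack.io"}  # placeholder admin lookup

def check_role(email: str, about_self: bool) -> dict:
    # The condition node branches on this field: proceed vs end the flow.
    if about_self or email in ADMINS:
        return {"next_step": "proceed"}
    return {"next_step": "end"}
```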
Now, this is where the beauty comes in. We wanted to enable the chatbot to
do much more than just summarize feedback, right? So this classify question
prompt identifies what kind of question the user is asking: is it a
question about their feedback, or is it a question about their overall
career at the company? Based on this, it is going to take the next steps.
After classifying this information, it is going to call the knowledge base.
The knowledge base has the information of all the feedback that the user
has received. So irrespective of whether the question is about their career
or about their feedback, we need to query this knowledge base to summarize
the feedback.
After getting this information, it goes into a Lambda called the data
enricher. This gets the intent of the question — whether it is a question
about their career or a question about their feedback — and it also gets
the output from the knowledge base: everything the knowledge base has
summarized, plus the raw data.
Now this Lambda, based on the intent of the user, is going to query
multiple other sources. The other sources it's going to query are our
classification of user roles — basically, what is their current designation
and what is going to be the next designation — and it's going to query
their skill sets: what are the skills that they have, and, to move to the
next role, what are all the skills they need to have as mandatory. It gets
all this data from the multiple sources we have stored.
It combines all this information and sends it to the final prompt. This is
the summarizer prompt, which takes as input all the data that we have
received so far along with the user's question, summarizes all of it, and
provides a neat response to the user, so the user can use this bot to help
themselves out.
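Invoking the whole flow from code can look roughly like this, assuming the InvokeFlow API with placeholder flow and alias IDs and our own input document shape:

```python
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime")

response = agent_runtime.invoke_flow(
    flowIdentifier="FLOW1234",        # placeholder
    flowAliasIdentifier="ALIAS1234",  # placeholder
    inputs=[{
        "nodeName": "FlowInputNode",
        "nodeOutputName": "document",
        # The flow input is an arbitrary JSON document; this shape is ours.
        "content": {"document": {
            "email": "user@antstack.io",
            "query": "How can I improve my career? What are my setbacks?",
        }},
    }],
)

# Node outputs stream back as events; print the flow's final output.
for event in response["responseStream"]:
    if "flowOutputEvent" in event:
        print(event["flowOutputEvent"]["content"]["document"])
```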
So this is what it looked like. Let's say the user is asking a question
about somebody else and they're a regular user: the execution fails, and we
throw an unauthorized error. Next, a user is asking a question about their
own feedback: it queries the knowledge base; it doesn't query the role, it
doesn't query the skill sheets or anything like that — just a plain
feedback summary.
And this is the interesting part. In this particular query, the user is
asking: how do I improve my career? What are my setbacks? So we query the
feedback, we query this person's skill sheets, we query this person's
current role and what the next role is, and summarize all that information.
And this is how the AI responds: it provides the strengths based on all
this information, and based on what people have said, it provides the areas
of improvement. It also says what you need to do to get your next
promotion. And it's not just some random information — all of this is
accumulated out of the feedback that this person has received and the
expectations that we have set for a particular role. So if you take these
points seriously, then you will improve in your career and you will be on
the right track for the next promotion.
Another nice thing to note about this: the query also asks, what are my
setbacks? This user has not received any feedback so negative that it has
become a setback in his career. So the chatbot is going to respond that
there are no major setbacks. It is what it is.
So imagine doing this in your company. You don't have to wait until the end
of the year to get to know whether you are getting the promotion or not.
You can ask a chatbot once in three months, once a month, anytime, and see
if you're on the right track to get a promotion. Imagine how simple your
conversation with your boss at the end of the year becomes: by looking at
this feedback, you can provide the data to your boss telling them why you
need that promotion.
Now, coming to the pricing. Amazon Bedrock, like any other AI provider,
charges based on the number of tokens you consume — both the input tokens
and the output tokens. If you're using the knowledge base, there is no
explicit charge for the knowledge base itself, but you're charged for the
underlying vector database and the queries you run on top of it. It's
basically like CloudFormation: you're not charged for CloudFormation, but
you're charged for the resources it deploys.
In a similar fashion, the same thing goes for agents. You're not charged
extra for using the agents, but you're charged for all the resources you
create under them — the Lambdas, the knowledge base, and the queries that
you run: how many tokens they consume.
Now, coming to the time — this is the most fascinating part, at least for
me. To build the entire thing I showed in this presentation takes less than
30 minutes if you know what you're doing. It just takes less than 30
minutes. The first time I did prompt flows and everything else, the entire
thing took less than two hours.
That is how easy it is to get started with services like Amazon Bedrock. It
just makes your life so much easier. You don't have to think a lot about
what is happening; you can just build it, test it, and then understand what
exactly is happening in the system.
And that's it for my session. Thank you everyone for joining. You can reach
out to me on LinkedIn or GitHub, or you can email me at my official email
ID. It's a pleasure meeting you all, and I hope to catch up with you soon.
Thank you everyone.