Transcript
This transcript was autogenerated. To make changes, submit a PR.
Are you looking to accelerate Well-Architected reviews and save countless hours on manual assessments?
Hello everyone.
My name is Shoeb Bustani, I'm a Senior Solutions Architect at AWS, and
I welcome all of you to my session: Accelerate AWS Well-Architected reviews with generative AI.
In this session, I will talk you through practical ways Amazon Bedrock can support your architectural assessments, right from identifying potential areas of improvement to streamlining your review process.
When you are building systems at your organization, how confident are you
that those systems are being built using the best practices for the cloud?
And this is where the AWS Well-Architected Framework plays an important role. It helps you understand the pros and cons of decisions you make while building these systems on AWS. By using the framework, you learn architecture best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems in the cloud.
The framework is based on six pillars that help you build stable and efficient systems on AWS. It is a set of questions and design principles spread across these pillars: operational excellence, security, reliability, performance efficiency, cost optimization, and sustainability.
Alongside the pillars are AWS Well-Architected Lenses, which provide guidance with a focus on a specific industry or technology domain, for example the Data Analytics and Financial Services Industry lenses.
To evaluate the health of your workload, you essentially answer a set of foundational questions based on the framework pillars and lenses. In addition, to align with your own organization's operational plans, internal processes, or industry, you can define and manage custom lenses.
Creating a technology solution is a lot like constructing a physical building.
If the foundation is not solid, it may lead to structural problems
that may undermine the integrity and function of the building.
Similarly, if you neglect any of the six pillars (operational excellence, security, reliability, performance efficiency, cost optimization, and sustainability) when architecting technology solutions, it can become a challenge to build a system that delivers functional requirements and also meets your expectations. Conversely, if you incorporate these pillars, they will help you produce stable and efficient systems, allowing you to focus on functional requirements.
There are three parts to the Well-Architected Framework. First is the content: the framework with the design principles, the pillars with their questions and best practices, along with the white papers that get published on the AWS website. Next is the self-service tool: the tool uses the framework to carry out reviews of the workload in the form of questions and answers. It is embedded within the AWS Management Console with its own dashboard, and it allows you to generate PDF reports for improvement plans. And then you have the data that comes from doing the reviews, including metadata and identified improvement plans.
As organizations expand their cloud footprints, it brings both challenges and opportunities to scale. When it comes to adhering to the Well-Architected Framework, for example, manual reviews may become time consuming and resource intensive. Different teams may apply the Well-Architected principles inconsistently across applications or across teams. Keeping pace with the latest practices may get tricky, especially with new services and features being released continuously. And do remember, a Well-Architected Framework review (WAFR) is a continuous activity. Finally, scaling reviews for numerous or large architectures can become difficult.
To overcome these challenges, builders like me at AWS have come up with a vision: accelerate AWS Well-Architected Framework reviews and enterprise adoption by leveraging the power of generative AI, and provide organizations with automated, comprehensive analysis and recommendations for optimizing their AWS architectures. The proposition here is to use the WAFR content, tool, and data with generative AI to drive Well-Architected assessments, and this approach brings multiple benefits to enterprises.
For example, the rapid analysis cuts the amount of time spent on WAFR reviews, resulting in time efficiency gains. AI-powered analysis ensures consistent and thorough evaluations and reviews across workloads. Optimized resource allocation and reduced manual effort result in increased staff productivity and a shift-left mentality, which results in overall cost savings for the organization. And you can perform more frequent reviews over time, refining future assessments and improving feedback loops.
There are other benefits too. We talked about scalability at the beginning, so the ability to run multiple reviews simultaneously would be a good thing for the organization. Taking it to the next level, there is interactive exploration through a chat interface: diving deeper into the assessments, asking follow-up questions, and getting a better understanding of the recommendations. This interactivity enhances engagement and promotes thorough comprehension of the results.
So when it comes to building such a system using generative AI, a prompt-driven approach appears to be the obvious choice. The prompt is your interface with the model, and crafting the right prompt is crucial for the output generation. Prompts work similarly to the decision process of human beings learning from analogy: by tweaking the inputs or the wording of the prompt, you get the desired output on completion, and slight changes or tuning to the prompt can have a significant impact on the outcome. Of course, it takes trial and error, and there is an artistic flair to it as well. But effectiveness also depends on how the prompt was framed.
When it comes to prompt engineering, there are multiple types. You have the zero-shot prompt, or prompt by instruction, which uses the LLM out of the box: it allows language models to perform tasks they have not been explicitly trained on. Then you have one-shot or few-shot prompts, which means you provide one or more problem-and-solution pairs as "shots" in the input prompt; this helps guide model performance. And then you have chain-of-thought prompting, which addresses multi-step problem-solving challenges in arithmetic and common-sense reasoning tasks: it generates intermediate reasoning steps, mimicking a human train of thought, before it provides the actual answer.
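To make these three types concrete, here is a minimal, illustrative sketch in Python. The example questions and wording are my own and are not the prompts used by the solution discussed later in this session.

```python
# Illustrative prompt patterns (not the actual WAFR Accelerator prompts).

# Zero-shot: the model gets only an instruction, no examples.
zero_shot = "Classify the following workload description as stateful or stateless:\n<description>"

# Few-shot: one or more problem/solution pairs ("shots") guide the output format.
few_shot = """Q: A web tier keeps session data in memory on each server.
A: Stateful

Q: A Lambda function reads a message, writes to DynamoDB, and exits.
A: Stateless

Q: <description>
A:"""

# Chain-of-thought: ask for intermediate reasoning steps before the answer.
chain_of_thought = (
    "Does this architecture meet the reliability best practices? "
    "Think through each component step by step, then give a final verdict.\n<description>"
)
```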
In principle, you would think that, in the current context of the Well-Architected Framework, you could create a series of prompts from the framework questions and sequence them through an LLM to solve the problem. That sounds straightforward. However, there are limitations with prompting alone: LLMs have a poor memory and work with very limited context, and completing this task relies on accessing extra knowledge sources.
And this is where the Retrieval Augmented Generation, or RAG, architecture offers more. Retrieval Augmented Generation is a process for retrieving facts from knowledge bases to ground LLMs with up-to-date, accurate, and insightful data supplied in the prompt.
It has three stages. In the retrieval stage, the relevant content is fetched from the external knowledge base or data sources based on the user query. The retrieved context is then added to the user prompt, which goes as input to the foundation model; that is the augmentation stage. And in the generation stage, the response from the foundation model is based on the augmented prompt.
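As a minimal sketch of these three stages, assuming a Knowledge Base for Amazon Bedrock, the RetrieveAndGenerate API performs retrieval, augmentation, and generation in a single call. The knowledge base ID and model ARN below are placeholders.

```python
import boto3

# RetrieveAndGenerate runs the full RAG loop: retrieval from the knowledge
# base, augmentation of the prompt, and generation by the foundation model.
client = boto3.client("bedrock-agent-runtime")

response = client.retrieve_and_generate(
    input={"text": "How should I design multi-AZ failover for this workload?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB_ID",  # placeholder
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0",
        },
    },
)
print(response["output"]["text"])
```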
Now that we know the basic functioning of RAG, how does it actually work? First, you need to optimize the data for RAG to work, and that is done in the data ingestion layer. The data is sourced from the relevant data sources, then chunked, wherein large pieces of text are converted into small segments, and stored in a vector database using embedding models. Then, during the text generation workflow, on receiving the user query or question, a semantic search is done: the retrieved passages (the R) are ranked and passed as context that is augmented (the A) into the model, which in turn generates a response, completing the G aspect of it.
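A simplified, illustrative sketch of the ingestion side, assuming the Amazon Titan embedding model and naive fixed-size chunking; real pipelines usually split on semantic boundaries with overlap.

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

def chunk_text(text: str, size: int = 1000) -> list[str]:
    # Naive fixed-size chunking for illustration only.
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(chunk: str) -> list[float]:
    # Amazon Titan text embeddings; other embedding models work similarly.
    body = json.dumps({"inputText": chunk})
    resp = bedrock.invoke_model(modelId="amazon.titan-embed-text-v2:0", body=body)
    return json.loads(resp["body"].read())["embedding"]

# Each (chunk, vector) pair would then be written to the vector database,
# for example an OpenSearch Serverless index.
vectors = [(c, embed(c)) for c in chunk_text(open("doc.txt").read())]
```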
So based on what we have seen and learned so far in this session, we now come to an improved approach, wherein we take the content, data, and tool and combine them with Amazon Bedrock, Knowledge Bases for Amazon Bedrock, and an LLM to drive Well-Architected assessments. Building on this approach, I'm excited to share that we builders at AWS have developed a comprehensive, one-click or few-click deployable solution to facilitate and expedite the AWS Well-Architected review process.
It uses workload documentation: you can upload your own solution architecture documents or technical design documents to the application. It uses mainstream input data sets, accepting the PDF file format that you can convert your documents into and upload to the application. It ingests AWS Well-Architected documentation from the AWS public website into its internal knowledge base. And it's extensible: you can extend it to include your own organization or industry review standards using custom lenses.
The entire stack is created as infrastructure as code with the AWS Cloud Development Kit, CDK in short, which you can deploy with one or a few clicks.
There are a number of artifacts that get created or used as part of the stack. We are using Amazon Bedrock with the Anthropic Claude LLM, and we are using Amazon OpenSearch Serverless as the vector store for the Amazon Bedrock knowledge base. Alongside, we use DynamoDB, which is our core database for storing the review run information. In addition, we use Amazon SQS and an EC2 instance, which hosts the Streamlit front-end application. There are other services too, such as S3 buckets and Amazon Cognito for user management. The heart and core of the reviewer is based on AWS Lambda and Step Functions, which manage the WAFR analysis runs. Finally, we use a CloudFront distribution with an Application Load Balancer and initial AWS WAF rules for you to improve upon.
Let's see how it works. For the demo, I have created a fictitious payment solution for a fictitious company, AnyCompany, that receives and processes payment files to drive its downstream systems and ML workloads.
The WAFR Accelerator sample does not require the documents to be in any particular format or template for it to work. However, the quality of the documentation that's uploaded for assessment is an important factor: detailed architecture documents will result in better inferences and hence better assessments. For this particular example, I have created a solution architecture document for the fictitious solution that I described earlier, and this is a pretty comprehensive document: it's 37 pages long, which is typical of these kinds of documents.
We will provide this as an input to the WAFR Accelerator tool, which will review and provide an architecture assessment in the context of this document and the knowledge base built from the AWS Well-Architected Framework documentation.
This is the landing page for the WAFR Accelerator sample. I have already created a fictitious user, so I'm going to log in using that user. Before I proceed any further: the user that I use to log in is part of the Cognito user pool that's provisioned as part of the CDK stack. All you need to do is go and create, or provision, the users in that user pool.
So this is the welcome page, and you have a few options here. The most important ones are the New WAFR Review and Existing WAFR Reviews links; they're available here as well. I will start with a new WAFR review.
There are two analysis types here: one is Quick, and the other is Deep, with Well-Architected Tool integration. Quick mode is, as the name implies, a quick mode wherein we club all the questions for a particular pillar into a single prompt to speed up the inferences. It's a good starting point for submitters. For example, if I am a solution architect and I want to submit an architecture document for review to, potentially, an architecture review board, I would like to mark my own homework first: I could come here, generate a quick WAFR assessment, and see how far my documentation is from the Well-Architected best practices. The responses, or the assessment, in this particular mode reside completely within the application.
And then you have Deep with Well-Architected Tool integration. This, as the name implies, is a deep review wherein we draw inferences for each and every question separately from the selected pillars, and it's also integrated with the AWS Well-Architected Tool. It means that on completion of your review, you get the review output in the application, and you also get a workload created in the AWS Well-Architected Tool; we'll see that a little bit later.
So in the interest of time, what I'm going to do is create a quick review, and I'm going to call it "WAFR accelerator quick demo two". Okay. Give it a description, then you can select the environment and industry type. In this particular case, because it's a payment solution, I'm going to mark it as financial services. I'll enter a fictitious name for the reviewer and select the lens I would like it to be reviewed against.
I would like it to be reviewed.
So right now at the time of demo, we are supporting three lenses, the
core AWS, architected framework lens, and then data analytics lens and
financial services industry lens.
So for this particular demo, I'm going to select AWS well architected framework.
The user is, created, biofield is populated by the username.
I used to log in the application.
Now I can go and select the pillars I would like it to review. I have the option to select all the pillars; however, you don't have to select them all, only the pillars you are interested in. Some customers, for instance, are interested only in security and reliability and could select just those pillars. But for demo purposes, I'm going to select all the pillars.
Then I upload the document I mentioned earlier. At this point, I have populated all the fields, so I'm going to create a WAFR review, and as the message says, it has been created successfully. If I go back, okay, it has commenced processing. I would let it complete, as it takes a little while, but in the interest of time and for showcasing purposes, I've already created a few demos in the past.
So let's select one of them. This is the metadata that we entered when creating this particular review. The first step is to create a solution summary: we take the document that has been uploaded and use generative AI to create a solution summary. It means the reviewers don't have to read the entire document, or at least not all of the reviewers need to read it; they can read the solution summary.
This is followed by the questions for all the pillars that have been selected. For this particular review, I had selected all the pillars, and hence you can see all the pillars in the tabs. Starting with this one: you have the question at the top, followed by a quick assessment, and then it tells you which best practices have been followed, along with recommendations and examples.
In this particular case, the document did not have enough information on this question, and we have devised the prompts in such a way that when the relevant information can't be found, it calls back and says: we don't have enough information to provide the assessment. Coming back to the recommendations: it gives you a list of recommendations to improve your architecture, followed by risks that the generative AI has identified and citations from the AWS Well-Architected Framework documentation.
And it does this for all the questions in the pillar, as you can see, and the same for the other pillars as well. In this particular case, it did identify a best practice from the Well-Architected Framework in the document, which was to separate workloads using accounts. And here it gives you citations and risks, and it does this for all the pillars that were selected for the review.
Let's have a quick look at the deep review. The deep review works in a similar fashion: you have the summary, or metadata information, followed by the solution summary; then it responds with the same kind of structure, answering all the questions for the selected pillars. One thing that it doesn't do is the risks, because we want the Well-Architected Tool to deduce the risks based on the choices that get selected. Otherwise, it's pretty much the same, and it does this for all the pillars.
Now let me select the quick demo two, which just kicked off. It's still in progress, but as you can see, it has already created the solution summary. Now it's trying to answer the individual questions, and it takes a little bit of time to do that.
Okay.
Another feature I wanted to showcase: let me select the deep review (I can select any of them). We talked about the ability to chat with the content to get a better understanding and a deeper dive into the review that has been carried out, and this is where the chat functionality is super helpful. Let me select the solution summary; I would like to chat with the content that was in the summary. For example, I could ask which AWS Regions are used. As you can see, this information is in there already, so it should be able to deduce it very quickly.
So that's fine, this is good. But if I ask which AWS EC2 instances are used, then of course this information is not in the summary, and it is not able to find it. So I can switch to the document, ask the same question again, and see what it does. Because this has been documented in the solution architecture document, it is able to pick up that information and provide me that data. This is quite a handy feature, and you can do this for all of the sections that have been generated.
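Conceptually, this chat feature grounds a conversational model in whichever content you select. A minimal sketch, assuming the Bedrock Converse API with a placeholder model ID; the sample's actual wiring may differ.

```python
import boto3

bedrock = boto3.client("bedrock-runtime")

def chat(context: str, question: str) -> str:
    # Ground the model in whichever content the user selected: the solution
    # summary, a pillar assessment, or the uploaded document itself.
    resp = bedrock.converse(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # placeholder model ID
        system=[{"text": f"Answer only from the following content:\n{context}"}],
        messages=[{"role": "user", "content": [{"text": question}]}],
    )
    return resp["output"]["message"]["content"][0]["text"]

solution_summary = "..."  # the generated summary text would be loaded here
print(chat(solution_summary, "Which AWS Regions are used?"))
```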
Okay. One of the things I mentioned earlier was the integration of the deep mode with the Well-Architected Tool, so let me go to the Well-Architected Tool in the AWS Management Console and show it to you. I'm back in the console after signing out. If I go to the workloads, I can see a workload whose name matches the review that we carried out. So if I search for it... oops. I'll do it the other way around: I'll select it from here and search for it there, just to prove the point.
Yeah, so it's essentially the same. If I click on this: this was purely generated by the WAFR Accelerator sample itself. We didn't do anything except create a review in the application, and this is what it has done. If you look at what it does: it goes in and creates a workload, answers all the questions (or tries to), and then it creates a milestone for it as well. This was a baseline version created by generative AI, and you can start to build on it as part of your human review process.
There's an interesting feature I want to show you, so I'll go in there and look at the review. At the very beginning, I told you that you need to answer a set of foundational questions as part of the review, and each of these questions typically has a number of choices underneath. This is not something I've done by hand. What we have done is ask the LLM: when you are doing an assessment or trying to answer this question, can you also select the choices that are applicable based on your assessment? So these tick marks, these choices, were actually selected by the LLM itself, and as you can see, not all of them have been selected. It does this for all of the questions. Another good feature is the population of notes: you get the assessment in the application, but at the same time the generated assessment is also populated in the notes section, as you can see, and it does this for all the questions across the pillars.
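Under the hood, this maps onto the AWS Well-Architected Tool API. Here is a minimal sketch of the calls involved, with placeholder IDs and choice values; the sample's actual orchestration differs in detail.

```python
import boto3

wa = boto3.client("wellarchitected")

# Create the workload the review will populate.
workload = wa.create_workload(
    WorkloadName="wafr-accelerator-deep-demo",  # placeholder
    Description="Created by the WAFR Accelerator",
    Environment="PREPRODUCTION",
    Lenses=["wellarchitected"],
    ReviewOwner="reviewer@example.com",
    AwsRegions=["us-east-1"],
)
workload_id = workload["WorkloadId"]

# For each question: record the choices the LLM deemed applicable and
# put the generated assessment into the notes field.
wa.update_answer(
    WorkloadId=workload_id,
    LensAlias="wellarchitected",
    QuestionId="securely-operate",  # placeholder question ID
    SelectedChoices=["sec_securely_operate_multi_accounts"],  # placeholder choice ID
    Notes="Generated assessment text goes here.",
)

# Snapshot the baseline so human reviewers can build on it.
wa.create_milestone(WorkloadId=workload_id, MilestoneName="genai-baseline")
```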
So let me go back to the review. What I can do here is go back and generate a report. Okay, so this is the report I've generated from the console. If you look at this, it has made some choices based on the generated assessments and, as a result of that, answered all the questions. Now, going back to the console.
If we look, the choices have resulted in a number of risks, in response to the choices that the generative AI thinks are applicable or have been implemented. As a result of ruling out the ones that were not selected by the generative AI, the tool has generated risks. If I click on that, I can see those risks underneath the individual questions.
Just to summarize what I did: I uploaded an architecture document and carried out a deep review, and after a while it created an assessment in the UI, as well as a Well-Architected workload with a milestone that you can use as part of your human review process.
Let's go back to the presentation and see what's happening behind the scenes. You've seen the UI; this is the overall architecture for the WAFR Accelerator.
If you recall, we looked at the basics of the RAG architecture, and the first step is data ingestion. As part of this, we download the files in PDF format from the AWS public website, put them into an S3 bucket, and ingest them into an Amazon Bedrock knowledge base. The way it works is that embeddings are created and stored in Amazon OpenSearch Serverless, which acts as the vector database. This is done as part of the CDK deployment itself, so you don't have to do it separately unless you want to update to a later version of the Well-Architected Framework in the future.
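As a sketch of what that refresh could look like, with placeholder IDs: after uploading updated framework PDFs to the source bucket, a knowledge base ingestion job re-indexes them.

```python
import boto3

# Re-index the knowledge base after uploading updated framework PDFs to S3.
agent = boto3.client("bedrock-agent")
agent.start_ingestion_job(
    knowledgeBaseId="KB_ID",        # placeholder
    dataSourceId="DATASOURCE_ID",   # placeholder
)
```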
The second part is what I just showed you. We have developed an application using Streamlit, which runs on an EC2 instance and is fronted by an Application Load Balancer and CloudFront. We have integrated AWS WAF with an initial set of rules, which you would want to enhance based on your requirements. And then you have the Amazon Cognito user pool. If you recall, I logged in to the application using a fictitious user, but I had to provision that user: the pool is created as part of the infrastructure as code, but you need to provision the users.
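Provisioning a user in that pool can be done from the console or via the API; here is a minimal sketch of the API route, with placeholder pool ID, username, and password.

```python
import boto3

cognito = boto3.client("cognito-idp")

# Provision a demo user in the pool created by the CDK stack.
cognito.admin_create_user(
    UserPoolId="us-east-1_EXAMPLE",  # placeholder pool ID
    Username="demo-reviewer",        # placeholder username
    UserAttributes=[{"Name": "email", "Value": "reviewer@example.com"}],
    TemporaryPassword="ChangeMe123!",  # user must reset on first login
)
```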
Once the document is uploaded, or submitted for review, it goes into an Amazon S3 bucket, and the application sends out an SQS message to trigger the WAFR review. This SQS message is received by a Lambda function, which in turn invokes the WAFR review Step Functions state machine, the heart of this reviewer. And if it happens to be a deep review, at this point it goes and creates an empty workload in the Well-Architected Tool.
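A minimal sketch of that trigger path, with an assumed message payload and a placeholder state machine ARN; the sample's exact schema may differ.

```python
import json
import boto3

sfn = boto3.client("stepfunctions")

def handler(event, context):
    # Each SQS record describes one submitted review.
    for record in event["Records"]:
        review = json.loads(record["body"])
        sfn.start_execution(
            stateMachineArn="arn:aws:states:us-east-1:123456789012:stateMachine:WafrReview",  # placeholder
            input=json.dumps(review),
        )
```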
The next step in the process is to extract the document content; for this we are using Amazon Textract. Once the data has been extracted, it goes to step six, which is about invoking the Retrieve and Invoke APIs. This is where the query generation happens, and we'll look at it in a little more detail on the next slide. But the idea here is to retrieve the relevant embeddings based on the question in context, then augment the query, the prompt, so to say, that we have developed, with that information and the uploaded document content, and ask the LLM to answer the question.
Let's take an example: the first SEC question from the security pillar. When this question is being reviewed, the first step is to find the relevant data from the knowledge base for this particular question. All the passages that are found are then ranked based on relevance and added as context to the query, which is passed on to the LLM.
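Putting those two steps together, here is a simplified, synchronous sketch; the actual sample handles multi-page PDFs asynchronously, and the prompt wording is illustrative.

```python
import boto3

textract = boto3.client("textract")
kb = boto3.client("bedrock-agent-runtime")
bedrock = boto3.client("bedrock-runtime")

# Step 5: extract text from the uploaded document (single-page sketch;
# multi-page PDFs need the asynchronous Textract APIs).
blocks = textract.detect_document_text(
    Document={"S3Object": {"Bucket": "uploads-bucket", "Name": "design.pdf"}}  # placeholders
)
doc_text = "\n".join(b["Text"] for b in blocks["Blocks"] if b["BlockType"] == "LINE")

# Step 6: retrieve relevant framework passages for the question in context.
question = "SEC 1: How do you securely operate your workload?"
passages = kb.retrieve(
    knowledgeBaseId="KB_ID",  # placeholder
    retrievalQuery={"text": question},
)
context = "\n".join(r["content"]["text"] for r in passages["retrievalResults"])

# Augment the prompt with both sources and ask the model to assess.
prompt = f"Framework guidance:\n{context}\n\nArchitecture:\n{doc_text}\n\nAssess: {question}"
resp = bedrock.converse(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # placeholder
    messages=[{"role": "user", "content": [{"text": prompt}]}],
)
print(resp["output"]["message"]["content"][0]["text"])
```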
As the review progresses, the Step Functions state machine keeps updating the individual questions within the tool, if it's a deep review. If it's not a deep review, the information is stored purely in DynamoDB; in the case of a deep review, it's stored in both places.
The next step is to refresh the screen. As you saw when I was checking the status, users can always refresh the screen and check the progress of the reviews. Then I showcased the chat functionality, wherein I went in, asked a few questions, and it answered based on the content generated by the review. It also created a milestone in the Well-Architected Tool, which I showed you earlier, and that could become your starting point in the human review process.
So this is the overall architecture for the WAFR Accelerator, with the numbering around it. Let me show you a couple of resources we have created.
I've co-authored a blog with my colleagues, Brijesh Pati and Rohan Ghosh.
We have also released this WAFR Accelerator sample under AWS Samples. It means that if you want to explore more, you can read through the blog, which provides a lot more detail than the demo, and then you can download the sample and explore it.
So let me show you that blog very quickly. This is the blog; I've shared the link for it, so you can have a look. And this is the AWS sample, which has a lot more detailed information on how to go about deploying and exploring the sample.
So before we conclude, let's have a quick recap. We started with the AWS Well-Architected Framework, its pillars and lenses, and the scaling opportunities as you expand your cloud footprint, and we saw how generative AI could provide answers for some of those scaling opportunities for reviews. And I showcased the WAFR Accelerator, which demonstrates the art of the possible when it comes to reviewing using generative AI.
So this brings me to the end of the presentation. I thank you all for joining the session.
Thanks.
Bye.