Conf42 Large Language Models (LLMs) 2025 - Online

- premiere 5PM GMT

Using Azure AI Foundry to manage all your Large Language Models


Abstract

Microsoft embraced OpenAI for their Copilot and Azure AI solutions in early 2024. But did you know that now, more than a year later, you can deploy several other LLMs from different vendors using your trusted Azure environment? Thanks to Azure AI Foundry.


Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hey everyone. Welcome to Conf42 on Large Language Models. My name is Peter De Tender, technical trainer at Microsoft, live presenting from Redmond, Washington. My session today covers using Azure AI Foundry to manage all your large language models. With that, I would say welcome and let's go. Over the next 45 minutes to an hour, I will cover the following topics, starting with a quick overview of the future of work with AI, followed by the main section, how Azure AI Foundry is used by Microsoft itself to run AI solutions and obviously how you can leverage the same. And then last, how to bring in a RAG architecture, meaning using all the beauty from Azure AI together with AI Foundry: generative AI using large language models, but then also integrating your own data. And for each of these topics, you can expect quite some live demos during the session as well. So again, my name is Peter De Tender, originally from Belgium, but relocated to Redmond, Washington to continue my job as a Microsoft technical trainer. I've been a Microsoft trainer for about six years now. Before I joined Microsoft in a full-time employee position, I was already working for them as a partner and vendor out of my own company for about seven years. My Azure background has always been on the infra and architecture side, but gradually I shifted more into DevOps and developing Azure solutions, and nowadays, obviously, a lot about AI and Copilot. Feel free to reach out if you should have any questions using any of the listed methods. Good. So with that all out of the way, let's jump in. Now, when I talk about AI in any of the Azure and AI workshops I'm teaching, I always like to start with a quote from our CEO Satya. He explained it as follows: organizations are asking not only how, but also how fast they can apply this next generation of AI to address the biggest challenges, but also opportunities, they face, safely and responsibly. The core of the quote will become more clear by the end of my session, so with that, let's start with the foundation: understanding what the future of work with AI looks like. For the last two years, give or take, organizations have primarily been experimenting with different models. I think it is safe to say that OpenAI with ChatGPT revolutionized the world at the end of 2022. But from there, several other players came into the market, like Gemini from Google, Anthropic with Claude, DeepSeek from China, Microsoft with the Phi models, Mistral, Hugging Face, and so many others. Now, what this slide represents is how many different models are actually being used by an organization. The average, you could say, is somewhere between three and five, so probably four, where none of the interviewed organizations was relying on just a single large language model for any of their AI solutions. Also important to highlight is that about 80% of early AI projects actually fail because they don't meet expectations, and they don't meet them because they're too complex. There's the complexity, the fact that especially generative AI is still rather new technology, the breadth of large language models to choose from, but then also
the fact that applications are changing, moving from single models into orchestrated systems, allowing them to learn and adapt continuously. Customers and users are expecting AI-influenced, or you could say AI-inspired, capabilities in almost any kind of application today, across different industries and across different use cases. Now, even amidst these challenges, it's clear that generative AI is what makes applications truly intelligent, but that's also a paradigm shift. AI is moving from this, I don't know, autopilot phase, which was all about narrow, purpose-built tools that use machine learning models to come up with predictions and recommendations, or just automating, to now having this copilot era where there's tremendous opportunity to really revolutionize how just about everything can start using those intelligent applications. You can now enable natural language interaction, constantly improving user experience and quickly delivering new features and capabilities to the market. So with that, let's shift gears a little bit and talk about Azure AI Foundry specifically. By the way, one of the reasons Microsoft has moved at such a fast pace over the last few months is because our Microsoft AI solutions within Microsoft are all running on Azure AI Foundry. We've built our Microsoft Copilot, reaching in the meantime millions of users across the globe, serving them across different platforms: for example, Copilot on the mobile device, Copilot in the browser on the web, Copilot within Microsoft 365 applications, Security Copilot, Dynamics Copilot, Azure Copilot, and GitHub Copilot. All of these copilots are a hundred percent backed by Azure AI Foundry, which now also becomes available to you. So you might have heard about Azure AI Studio before; it's still the same foundation, I would say, being the one-stop shop, like a management portal for developers, IT admins, and cloud admins, if you want to create your own custom copilots leveraging AI and Azure AI services. So in short, Azure AI Foundry is a trusted, integrated platform designed for developers, IT admins, and cloud architects, allowing them to design, customize, and manage AI applications as well as agents. It offers a rich set of AI capabilities and tools, and yes, I'll walk you through a few of these in a demo, using an easy-to-use, easy-to-navigate portal, but there's also a unified SDK providing you APIs to really accelerate the path from development to production. Now, what sets Azure AI Foundry apart is the accessibility through the world's most loved developer tools, meaning GitHub, Visual Studio, and Copilot Studio. This really integrates with a lot of other scenarios. So what we see here is continuously building up on Azure AI Foundry being this open, flexible, modular platform, with a lot of the tools a developer needs in a single platform to build multi-agent solutions, integrating third-party tools, just like we've done with our own models. This also means that developers now get access to quick-start AI application templates to support their development cycle, really shortening that complexity that I talked about before. Any of these templates can be customized using a wide array of already existing models and tools, making your applications future-proof development investments. Another part I wanna talk about is IT governance. So IT governance at scale, what we call here enterprise setup, is baked into AI Foundry.
It allows you to provide a comprehensive approach, providing self-service experiences, customizable configurations, and overall enhancing agility and security, but also keeping compliance in mind. So to give you an idea, there's obviously prebuilt role-based access control baked into Azure and Azure AI Foundry, including Owner, giving you all permissions, including changing security permissions; the slightly lower-permission Contributor; but also Reader, AI Developer, and AI Inference Deployment Operator roles. A little bit complex there, but what it means is that for any kind of responsibility as part of your development cycle, you can have a corresponding role-based access control role. Next to that, all these roles are accessible from within the same AI Foundry, where we now use a topology called the hub and projects. The hub is like the highest level in the topology; out of the hub you're gonna allocate one or more projects, and within a project you're gonna define the different permissions. On top of that, you can think about other features mentioned here on the diagram: identity and access management, network security, data protection, encryption, compute, storage, quota, access to the models. All that is now becoming available within a project and within a hub, mainly avoiding that each and every developer, when they're working on an AI-inspired application, has to deploy the full architecture themselves. So you would almost say that your cloud team is now building the hub and project, and developers are consuming the services, the building blocks, within a project. And then on the outside we have our business IT, and from there we could also expand with IT security, because in the end it's still part of the broader Azure world, which means that everything you might already know from Azure tenants, Azure subscriptions, management groups, like the whole governance layer around it, is still valid once you start deploying your AI Foundry and corresponding Azure resources. Apart from the core AI services dependencies, extended with other Azure resources like Key Vault, private networking, and the like, you can now also easily, I would say, provide access to those external resources from within AI Foundry using a feature called connections. So again, it means that instead of needing to provide access to your development team, your IT and cloud admins, to all those resources, it's now becoming just another aspect of your AI Foundry landscape, and these connections are typically linked to multiple projects within a hub. And with that, let's shift to a first demo here, where I'll walk you through the base deployment of Azure AI services, as well as a first look at Azure AI Foundry, showing you the hub and project, as well as how to add and manage your connections. So my starting point here is my Azure subscription, and I got everything inside my sample Conf42 Foundry resource group. I already deployed most of the resources that I need, but my assumption is that you already know how to deploy a resource in Azure, so that's not all that important for now. One of the building blocks I have is an Azure AI hub. I'll talk about that a little bit again later on. Within it, we got a project. The core that I'm using here is Azure AI Services, and I got a few side services over here, like a container registry, an Azure Key Vault, and Log Analytics for my monitoring.
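For developers who would rather enumerate those project connections from code than from the portal, the preview azure-ai-projects package exposes them on a project client. Treat the sketch below as an assumption-laden illustration: the connection-string format and the property names (name, connection_type) follow the preview releases I am aware of and may differ in later versions of the SDK.

```python
from azure.identity import DefaultAzureCredential       # pip install azure-identity
from azure.ai.projects import AIProjectClient           # pip install azure-ai-projects (preview)

# Connection string copied from the project's overview page in the AI Foundry portal:
# "<region>.api.azureml.ms;<subscription-id>;<resource-group>;<project-name>"
project = AIProjectClient.from_connection_string(
    conn_str="<your-project-connection-string>",
    credential=DefaultAzureCredential(),
)

# List the connected resources (Azure OpenAI, AI Search, storage, ...) that the
# cloud admin made available to this hub/project.
for connection in project.connections.list():
    print(connection.name, connection.connection_type)
```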
So the starting point will most probably be your AI service, which means that you would go into the Azure portal. From there, you would deploy a new service called Azure AI Services, and that's pretty much it. Now, from a development perspective, there are a few things that you need to keep in mind for connecting to your AI service: primarily, I would say, endpoints and your keys. So when we move to endpoints, what we have obviously depends on the different AI use cases. In my example here, I'm using OpenAI to really have that generative AI capability available as an endpoint, allowing me to integrate later on my large language models and so much more out of my Foundry interface. If I wanna interact with other AI services, like the more traditional ones, right, computer vision, content safety, language, translator, I can again find all that information up here. Now, the reason why we have our AI Foundry as a portal is because in the end, from here, what I'm doing is just managing the Azure resources. So you could almost say that this is a part that maybe your developer is not really seeing anymore. Why not? Because for them, everything that is related to management is now moved into AI Foundry. By the way, if you navigate back to the starting point, the AI service, you can see down here there is the Azure AI Foundry portal link. So let me move over to that one, and that's where we are now, landing in our Foundry portal. Now, there are a few different ways you can end up here. I talked about the hub and project, right? But within my setup here, you won't really see that; it depends a bit on how you get there. Right now I'm in my Azure OpenAI service, because for this scenario I started from deploying an Azure OpenAI service. Now, if I navigate to my other scenario, I'm in my Azure AI Foundry homepage, you could say, which by the way you can navigate to from ai.azure.com. And within it, as you can see here, I do have my hub and my project. When I navigate to my hub, you're gonna see some highlights from my Azure subscription. So down here I can still see my Azure subscription; I could switch back to manage it in my Azure portal, but now I can also, as a developer, pick up the AI endpoints and keys that I need from here. So instead of just having my OpenAI service, it has now switched, you could say, to the Azure AI Foundry management center. Within the management center, we have obviously the overview of our hubs and projects. And again, the logic is that you would create one or more hubs as the higher level in the topology, and underneath you would create one or more projects within a hub. That's the relationship you can see from here: I got my AI project as part of my AI hub. That's the logic behind it. Staying within the hub level for a minute, I talked about RBAC, role-based access control, where now you can see I got my AI admin, and a little bit lower I got my Azure ML Data Scientist, reflecting that you have someone managing the AI service and you might have a data scientist on the other side interacting with the storage, providing the data endpoints most probably, and then from there allowing my developers to start interacting. You can also manage your quota. So if I switch back up here, one of the, I would say, configuration options you have within Azure later on, when we start deploying our models, is that you need to know a bit about the models.
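As a hedged illustration of how a developer would consume the endpoint and key mentioned at the start of this walkthrough, here is a minimal Python sketch using the openai package's Azure client. The environment variable names and the API version are my own placeholder choices, not something the session prescribes.

```python
import os

from openai import AzureOpenAI  # pip install openai

# Endpoint and key are copied from the resource's "Keys and Endpoint" blade
# (or from the hub/project overview in the AI Foundry portal) and stored as
# environment variables so they never end up hard-coded in source control.
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],  # e.g. https://<resource>.openai.azure.com
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",  # pick the API version that matches your deployment
)

# Nothing is called yet; the client is simply ready to talk to the deployed models.
```

If you prefer to avoid keys entirely, the same client also accepts a Microsoft Entra ID token provider (azure-identity's DefaultAzureCredential), which pairs nicely with the RBAC roles discussed earlier.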
First of all, for any model you need to know how many tokens, the virtual AI currency I would call it, your application will need, but also know that each and every Azure region, together with the models, also gives you a quota. So in my setup here, I already have a few scenarios, like models deployed, and I'll show you in the next demo how to actually do this. But what you can see here is that I deployed GPT-4o; this is the version I'm using most, probably the latest one, and I'm allocating 8,000 tokens per minute, where now, for the total of my Azure region, there's 450,000 available. So depending on the needs of your application, you're gonna allow fewer or more thousands of tokens, because they heavily influence the richness, I would say, of the prompts your users can run, but also how complete the actual prompt response can be, where at some point you might actually run out of tokens. And that's where, in the quota, you need to validate what your Azure region provides. So that's the first, I would say, high-level scenario on how to actually navigate across your AI Foundry: starting from the Azure portal, deploying an Azure AI service, and within it navigating to Foundry, where next you have the option to use the hub and project topology, allocating RBAC. And then in the next step, we're gonna allocate our actual language models. But for now, back to the presentation. While most of my demos will happen from the AI Foundry portal, I understand that most developers will probably like to interact from a, I don't know, development interface, and that's where an SDK comes into the picture. So the good news is that there is an Azure AI SDK specifically available to equip developers, streamlining AI integration and really enriching that user experience and building functionality into their applications. This toolkit supports multiple programming languages, Python, C#/.NET, Java, you think of it, you name it, and it is supported, really enabling developers to select the language of choice and making them more productive while developing generative-AI-inspired applications. So from there, developers can efficiently build, evaluate, and deploy those AI components. The SDK integration is obviously part of the already trusted development environments mentioned before, GitHub, Visual Studio, VS Code, and you're gonna interact with your AI Foundry not from within the portal, like I'll show you in the demo, but obviously from within the SDK integration directly. With the base Azure services and AI Foundry up and running, let's talk a bit about the large language models, which by the way is the topic of the conference overall, right? With Azure AI Foundry, you get access to an extensive list of large language models, really allowing you as the developer, or obviously your customers, to choose the models that make the most sense for your solution. Now, while this list here on screen might not feel extensive, know that there are more than a thousand different models, yes, more than a thousand, to choose from available today, which can all be deployed within your AI Foundry. Even DeepSeek, one of the more recent large language models, integrated only a couple of weeks ago, is already available, and we got more on the list coming out in the near future.
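Circling back to the tokens-per-minute quota mentioned above: a quick, hedged way to get a feeling for how many tokens a prompt consumes is the open-source tiktoken tokenizer. The o200k_base encoding used below is the one the GPT-4o family uses, and the 8,000 TPM figure is just the example quota from the demo, not a recommendation.

```python
import tiktoken  # pip install tiktoken

prompt = "Can you provide a Belgian food recipe, including ingredients and instructions?"

# GPT-4o models tokenize with the o200k_base encoding.
enc = tiktoken.get_encoding("o200k_base")
prompt_tokens = len(enc.encode(prompt))

tpm_quota = 8_000       # tokens-per-minute allocated to the deployment in the demo
max_completion = 800    # rough upper bound we allow per answer

per_request = prompt_tokens + max_completion
print(f"~{prompt_tokens} prompt tokens, ~{per_request} tokens per request")
print(f"roughly {tpm_quota // per_request} such requests fit in one minute of quota")
```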
So for that, let me jump back to my demo environment and show you how easy it can be to search for models, deploy the models, and also cover a few other, I would say, baseline tasks once you start using models, all from within AI Foundry. All right. So with that, let's have a look at how the actual large language models can be deployed, again from within our AI Foundry. I'm still at the screen where I finished my previous demo. There was actually one part I forgot to show you, and that's the connectors, or connected resources. Now, if you think about how a developer interacts with the AI service, as I showed you, there's the endpoint, but there are also the keys. And later on we'll talk about RAG architecture, which means you're probably gonna integrate, like, AI Search. And then in the last part I'll talk a little bit about content safety, making sure that your application is following our Microsoft responsible AI framework guidelines. So all of these are standalone services, standalone resources in the Azure platform, where now, again, in that mindset that your developer will probably need access to all those resources, instead of giving them access from an Azure perspective, you could now allocate them as an available resource for the hub or an individual project. And that's mainly what you can manage from here. So just to clarify the interface, if you've never really seen AI Foundry in action: the highest level, AI Foundry, is available from ai.azure.com, and underneath you're gonna create a hub. Within the hub, you later on gonna create a project, which we already talked about, and then from there we interact with our connected resources. So from here I could add a new one. I talked about Key Vault, which I didn't really deploy yet, but you could connect to OpenAI, to your speech and some other services, and if you want, you could actually directly connect to other service building blocks in Azure, or even outside of Azure as well. Now, from here, we can shift to our models. There are a few different ways, I would say, within the portal to do this, depending on whether you are active on a project, active on a hub, or active in the OpenAI service that I showed you at the start. From here, I navigated inside the hub and I'm now inside this specific project, where you can see, as you already know from my previous demo, I already have some of my connections available. So what this means is that, for example, my cloud admin pre-deployed the connections and they're now becoming available. But imagine that I also wanna deploy my own specific connection that I only want to use within this specific project. I could do this again from here, since I'm already in the project, right? I could manage it all from here, or why not, I could open up my project in a separate blade, and then from here I get again my models and endpoints. So it looks about the same; it just depends on how you're gonna navigate to it. So let me show you how we can add a new model, and this is now giving me access to all possible models within our environment. So you can see, I talked about a bit more than a thousand, and we actually have almost 2,000 available in there. You can start with prebuilt collections, so if you want, you could filter based on only the ones from OpenAI, like an easy example, and then the list will obviously only present those ones.
Or if you wanna filter based on what I can do with them, and then maybe some specific features, like I only wanna have the ones supporting chat completions and overall completions, then you can see that there are the OpenAI ones, the Microsoft Phi ones, there's Mistral, there's Llama from Meta, and again all the other ones showing up here. So the next step is selecting your model. It always provides you a pretty detailed description of what that model actually represents, and then you can also see some of its capabilities. From there, you would confirm. Now, since I already deployed this one, it's not gonna allow me to deploy it again, so let's try and select another one here, GPT-4o, still one of the more recent ones. Again, an explanation sharing information about the latest version, you could say, and then some description of what it actually allows you to do. So I'm gonna confirm, and the next step is now defining the specifics for my project. I can choose the deployment type, Global Standard, which means that you're gonna pay per API call, and if you want, and if the model supports it, there are a few other ones as well. I would say if you wanna know more details, then please consult our Microsoft documentation, because I don't have the time in this demo to expand on all of these, but it has a lot to do with the high availability aspects of the Azure architecture in the backend. Next, in our deployment details, we can fine-tune and customize this a little bit, so you've got the different model versions, some versions up and down, you could say. Typically, my guidance is to use the most up-to-date version, but there might be specific use cases where maybe you don't have to or don't want to do that. If you wanna integrate it with one of your previously discussed connections, you could do that as well, and then you define the tokens-per-minute rate limit, and again, this influences the use cases of your application. If you wanna integrate content filtering, so again, the content safety, right, that's where you could, at least for now, pick the default, because I'll talk about this a little bit more towards the end. So you pick your model, you confirm some of the settings, and from there you're gonna run the deployment. Pretty easy, I would say, pretty straightforward. Once we have a model deployed, like in this case, we typically provide you some starting points. And again, we're targeting developers, right? So what you need here is your Azure credentials, installing, for example, Python in this case, and how to actually start testing and validating, and it's almost literally copy-pasting, if Python is your language of choice. If you go, oh, actually I want to use another language, I'm gonna do C#, bam, within just a split second it's gonna give you the different steps: how to import the necessary packages, install the Azure AI packages and then obviously the identity one, and then from there creating a client that points to your AI service endpoint. What's important here is the name of your endpoint together with the name of your deployment. And then you're gonna tune your chat conversations; I'll talk about this a little bit later on. And then, if you want, there are some additional samples on how to interact with conversational context, how to keep the history, and so on. Another angle to validate some model information: I now moved from the project into the hub. The way to deploy is obviously a hundred percent the same.
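The portal's "starting point" code the demo refers to boils down to something like the following hedged Python sketch. The deployment name gpt-4o and the environment variable names are placeholders matching this demo, not fixed values.

```python
import os

from openai import AzureOpenAI  # pip install openai

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

# "model" is the *deployment name* you chose in AI Foundry, not the raw model id.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are an AI assistant that helps people find information."},
        {"role": "user", "content": "What is the typical weather in Seattle in mid-March?"},
    ],
    max_tokens=400,  # cap the completion so a single call can't eat the whole TPM quota
)

print(response.choices[0].message.content)
print("tokens used:", response.usage.total_tokens)
```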
So if I wanna search for Llama instead of using any of the prebuilt categories and filters, you could now interact with any of these models. So the baseline is the same on the hub level and on the project level, nothing really different. Another scenario that I would like to highlight here (this is the playground, but I'll talk about the playground a little bit later on) is inside our model catalog. From here, it's basically giving you the exact same view. Now, where it is slightly different is that here I'm inside the OpenAI-service-specific option, and again, I like to highlight that it's all based on AI Foundry, but every now and then, depending on the model you deploy, it might show you more or fewer integrations. If I now move to another open model here, you can see that it also allows me to deploy it. It still provides all those details, but now I could also look into: why should I maybe deploy this if one of my colleagues already deployed this scenario? So giving you, again, quite some options. Another question that we sometimes get about the models is getting access to metrics. Now, this is not showing you, like, specific details about the model, but more about how your applications, your developers, are using and targeting this model. I just deployed this without actually having a sample app in front of it, so that's why my metrics are not really impressive, right? But it should give you some idea that auditing and monitoring are also baked in. And then a bit on our Microsoft security framework, or the responsible AI framework: since I deployed my model using a default filter, it already has some content safety baked in as part of my language model. And obviously, for now, because I didn't use it, there are no violations, there's no abusive use, and so on, but again, giving you a pretty nice idea about what you could use it for. Another example I can show you is using the model catalog. And again, this is the same catalog that I showed you before, just giving you the full list in a different view. Why did I switch to yet another view? Because there's another question that we can answer from here, and that's: what is the model supposed to help me with? Or also answering the question: what model is better for a specific purpose than another? Since I got DeepSeek up here, I'm gonna check out that model. And same as before, it provides a description, but now it also gives me a link to benchmarks. So by design, what it's gonna do is look at, I would say, its own core competition, right? Because as a developer, you need to think about all these different language models, and I'm pretty sure that the conference sessions will help you with a lot of this. But then within our AI Foundry, knowing that you get access to more than 1,800 now, how do you decide across those language models? That's where you can easily select any of the models and there will always be benchmarking. In this case, it's comparing the DeepSeek V3 that I selected with some of the other popular counterparts, but nothing blocks you from also integrating a comparison that you can manage, that you can tune, with some other models. I think with that, you should have a pretty good idea about using different parts of the AI Foundry portal, how to deploy the different models using the OpenAI service, using other services within the hub and project, and then we also talked about
the higher-level model catalog and the benchmarking. With that, let's switch back to the presentation. Nice. So we're now at the point where we have our large language models deployed. We did some benchmark testing and I briefly talked about some other capabilities, like the fine-tuning I briefly touched on. So I can hear your next question already coming up: okay, AI Foundry is the go-to portal I use for generative AI with large language models, but what about all the other, more traditional AI services that we had in the Azure portal in the past? I would say that's actually a great question. Now, remember, in the introduction section we talked about the paradigm shift in application development where customers and users somewhat expect to have AI capabilities in basically any kind of application they're building or using. While generative AI seems to provide a lot of those capabilities, all the other AI services you might know from the past, existing in Azure already, are now also gradually moving into AI Foundry as that management, AI services portal. For example, AI Search for RAG architecture, although I'll talk about that later on, but also AI Speech, AI Vision, Document Intelligence, Language, Translator; all those services have been around in Azure for close to 10 years and can now gradually be integrated and managed from within AI Foundry as well. Now, the magic behind the scenes to manage all this is the Foundry playground. I already showed you a little bit of this from within our AI Foundry portal, especially around large language model deployment, but there's actually a lot more you can do with it. I would say let's have another look. In this next demo, I'm gonna walk you through AI Foundry, showing you some of the capabilities around the playground, allowing you to test and validate your models, and also switching to some more traditional AI capabilities. Good. So we have our models deployed from the previous part of the presentation, which now means that we're ready to actually test and validate. The way to do this is, once more, from within AI Foundry, now selecting playgrounds. You could do this directly by navigating to ai.azure.com/playgrounds, and then it's gonna pull up your project and allow you to actually build out your playgrounds. Remember, we're primarily focusing on generative AI using chat conversations, but remember that we do have other AI services as well. So if you wanna integrate with, like, speech, or the newer agent-based, AutoGen-alike scenarios, or you wanna integrate with images using, for example, DALL-E, or some other model using the language playground, again, there are so many different ways to start testing and interacting. The most obvious one is using the chat playground. So in the previous part of the demo, we deployed our GPT model as part of our hub and project. I got two of them, linked to different connectors, so again emphasizing that all those building blocks from the start are still nicely coming back here. I could start easy: I got a pretty generic AI assistant message here, and I'm gonna ask something like, what is the typical weather in Seattle mid-March? Now it's nicely responding, in mid-March and so on. Obviously it's not about the details, although in a real-life scenario I would obviously encourage you to really validate and verify the actual response. Based on your model, what we get as a confirmation here is that the model is working, it is responding well.
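The chat playground keeps the conversation history for you; in your own code, that history is simply the messages list you keep appending to. A hedged sketch, where the helper function name, deployment name, and environment variables are my own placeholders:

```python
import os

from openai import AzureOpenAI  # pip install openai

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

# The running transcript: every user turn and every assistant reply is appended,
# so the model always sees the full conversational context, like the playground does.
messages = [{"role": "system", "content": "You are an AI assistant that helps people find information."}]

def ask(user_text: str, deployment: str = "gpt-4o") -> str:
    messages.append({"role": "user", "content": user_text})
    response = client.chat.completions.create(model=deployment, messages=messages)
    answer = response.choices[0].message.content
    messages.append({"role": "assistant", "content": answer})
    return answer

print(ask("What is the typical weather in Seattle in mid-March?"))
print(ask("And how does that compare to Belgium?"))  # this follow-up relies on the kept history
```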
Now, if I ask something like, can you provide a Belgian food recipe (remember my Belgian roots), it's gonna come up with some ingredients, and again, it's still doing a pretty good job. Let's see, it came up with mussels and fries, perfect, like a typical Belgian dish. It provides me the ingredients, and next to that it also provides the instructions, and I'm actually getting a little bit hungry now out of that, but we're not there yet, because I need to show you a few other things. Now, back to our starting point. I got another project here, so quickly switching, where this time I got two different language models. Remember, a language model is trained on public data up to a certain point in time. Easily said, GPT-3.5 is less up to date than GPT-4. An easy example I can use to show you how this works is: who is the Prime Minister of the UK? Now it's gonna tell me, based on a cutoff of October 2021, the Prime Minister is Boris Johnson. Pretty cool, I got my answer. Where now, if I switch to my GPT-4 model and run the same prompt, I'm just gonna copy it from here, you can see that the answer is different, and not only different in context; the actual answer, I would say, is more up to date, but also, just by switching the model, you can see that I now get access to more tokens, and instead of a short answer, a little bit longer answer. And if you wanna tune all this a little bit more, that's where you can go back to the previous part of the demo. Another thing that's pretty cool is interacting with what your chat actually can do. So right now I used an easy example; this could be used for, I dunno, writing an essay on UK politics or something. Now we could tune this a little bit more, where I could do something like: you are a Microsoft technical trainer with expertise in Azure and Azure AI. You can answer questions as long as you keep it polite and professional. You cannot answer questions about Amazon AWS or Google Cloud Platform; you apologize in those scenarios and offer a Belgian beer flavor to try out instead. There we go. So now what I'm doing here is tweaking, fine-tuning if you want, a use case where this could almost be like a virtual Microsoft technical trainer. I'm gonna run a new chat conversation, and from here I could ask: can you explain a bit about Azure Kubernetes Service? It comes back, as expected, with a nice, detailed response. I'm using GPT-4o, the latest version, with quite some tokens available in my subscription. Now, from here I could ask: what is the use case for AWS Elastic Beanstalk, like an existing AWS service? And now it tells me, oh, I'm sorry. So I asked it to be professional and polite; I think that's pretty well covered here. It's unable to answer questions about AWS or any other AWS service, and then, however, to make up for it, again being nice and professional, it recommends a Belgian beer. How cool is that? You can actually tune your environment, your chat conversations, using, by the way, what we call the system message, and then as a developer you would integrate that obviously as part of your code as well, allowing the chat conversations to go back and forth. So that's pretty much it. This demo showed you how to interact with the playground, specifically this one for chat, or going back to the previous part of the demo, where you can choose across different playgrounds depending on use cases, for images, integrating audio, integrating other assistant-like scenarios. Back to the presentation. Sweet.
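In code, that "virtual Microsoft technical trainer" persona is nothing more than the system message passed as the first entry of the conversation. A hedged sketch, again with placeholder deployment and environment variable names:

```python
import os

from openai import AzureOpenAI  # pip install openai

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

# The system message constrains the model's behaviour for every turn that follows.
system_message = (
    "You are a Microsoft technical trainer with expertise in Azure and Azure AI. "
    "You answer questions as long as they stay polite and professional. "
    "You cannot answer questions about Amazon AWS or Google Cloud Platform; "
    "apologize in those scenarios and offer a Belgian beer flavor to try out instead."
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": system_message},
        {"role": "user", "content": "What is the use case for AWS Elastic Beanstalk?"},
    ],
)

print(response.choices[0].message.content)  # expect a polite refusal plus a beer suggestion
```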
This brings me to the last topic in today's session, covering RAG architecture, or retrieval-augmented generation. Now, we heavily discussed large language models and how they enrich your applications with generative AI capabilities, for example conversational context, natural language, and primarily chat-driven interaction. One main characteristic of most LLMs is that they're trained using public internet data, which obviously is totally fine, but what if your app needs to know about your organizational data specifically, or what if your customer, your user, can only use internal, company-specific information, not the context from public models? That's exactly what RAG, or retrieval-augmented generation, is about. Starting from a large language model approach, you are now augmenting, enriching if you want, the dataset your AI application is using with your own data, which can be images, documents, knowledge bases, and they can be stored within the Azure backend. And that's really the next step, right? So if you are wondering, where does this data come from, how do I integrate it, there are a lot of different options. The easiest one, I would say, is using Azure Blob Storage, but nothing blocks you from integrating with, for example, Amazon S3 buckets as well. And yes, I can show you also how to upload files manually in the Azure portal, although I don't think that's an enterprise-recommended approach. The backend interaction happens out of Azure AI Search, building up what we call vector information for each and every one of your data sets, and then, once the indexing is done, your AI application will be able to respond to prompts in the same way you expect from your large language model, but now using knowledge coming in from your own data. And as you can expect, that also brings me to the last demo here, showing you what this RAG architecture looks like and how to use it from within AI Foundry. So regarding the RAG architecture, the main idea here, and again I'm just using the playground, is importing, or linking it to, our custom data. I'm inside my project, and I'm gonna select here a new section where I could define other system messages, but that's not specific to RAG anymore. So we're shifting down to the next section, and that's using your own data. We can add new data sources, and for now I'll make it a little bit bigger. You can choose from any of these: using Azure AI Search, a service that's been around in the platform for a couple of years already and is now nicely integrating with the generative AI capabilities of what an AI solution is offering today; Azure Blob Storage, where obviously the idea is to have your data sets, like PDF documents, images, Word files, any other text files, JSON-alike scenarios, in Azure Blob Storage (remember, you could link this to your hub and project, making sure that not everyone needs access to the blob storage, and I would also recommend using role-based access, preferably managed identities, to interact with those data endpoints); Azure Cosmos DB, our non-relational database here, which could be a perfect data set or database endpoint; interacting with Elasticsearch; if you want, you could point straight to a web address, connecting to a knowledge base, a website, an intranet, maybe a SharePoint-alike scenario; or, why not, uploading files and then still allowing you to turn them into an index. So quite some options. So imagine we are gonna go for Blob Storage, and see the sketch below for how the same grounding looks from code.
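As a hedged illustration of what "chat on your own data" looks like from code rather than the playground: with recent Azure OpenAI API versions you can attach an Azure AI Search index as a data source on the chat completion call. The endpoint, key, and index names below are placeholders, and the exact field names can vary by api_version, so treat this as a sketch of the pattern, not a definitive recipe.

```python
import os

from openai import AzureOpenAI  # pip install openai

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What does our internal travel policy say about Belgium?"}],
    # "On your data": ground the answer in an Azure AI Search index instead of only public training data.
    extra_body={
        "data_sources": [
            {
                "type": "azure_search",
                "parameters": {
                    "endpoint": os.environ["AZURE_SEARCH_ENDPOINT"],  # https://<search-service>.search.windows.net
                    "index_name": "company-docs-index",               # the index built over your blob documents
                    "authentication": {
                        "type": "api_key",
                        "key": os.environ["AZURE_SEARCH_KEY"],
                    },
                },
            }
        ]
    },
)

print(response.choices[0].message.content)
```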
Back in the portal, you need to identify your subscription, the storage container, the blob storage obviously, and then, if you wanna use a specific Azure AI Search, you define the index, and then from there, how frequently is my information changing, which influences my index. So from here, pretty straightforward: you point to the data endpoints and that's pretty much it. If I switch here to uploading files, it's still gonna ask for storage, so you still need to have that storage backend, where now it's gonna enable cross-origin access to allow me to interact from my AI service here, my playground, with my blob storage container. I would still need to identify the index, so this could be the live PDT index, where now I can basically, from here, start uploading my files, like a Word document. For some reason it didn't pick up the RBAC that I specified a few minutes before the recording, so I'm gonna save this for now, but you probably get the idea. And again, most probably you're not gonna do this from here, because you're gonna ask your data team, your data admins, your data scientists maybe, to upload the data into Azure Storage or into a Cosmos database, but then the way to interact with it from here would be basically the same. That's, I would say, mainly what I wanted to show you: different capabilities, different ways to interact with your GPT-alike scenario, at least in this case, your overall AI-inspired scenarios; instead of just using the large language model, how to interact with your own custom data, where again you've got a multitude of sources. I am close to running out of time, so I guess I'm gonna switch back to the presentation for the last couple of words before closing the session. Now, before we wrap it up here, I wanted to highlight one other, actually crucial, aspect in developing AI solutions, and that's our Microsoft responsible AI framework. In short, we keep responsibility high across all levels of our AI development cycle and also in our AI and Copilot solutions. For example, we don't train on your data, we also do not use your prompts to train the models, and eventually your data remains your data. Specifically within AI Foundry, you can manage the integration with our responsible AI framework out of Azure AI Content Safety, which is a robust, safe content moderation platform, leveraging AI to ensure that whatever you are building keeps safety in mind and also enhances user experiences. It integrates with a lot of powerful AI models, allowing you to detect, but also eliminate, inappropriate content, for example hateful, sexual, violent, or self-harm-inducing responses. Using filters and safety thresholds, organizations can tailor content safety measures to, again, make sure that once the application goes live, a lot of the safety measures are already in place. This brings me to the end of this session, where I would like to highlight several of our top use cases for Azure AI Foundry: starting from deploying your Azure AI services and AI Foundry hubs and projects, setting up the development project governance; using AI Foundry to deploy, test, and validate your large language models of choice; from there, primarily using the playground that I talked about and showed you to validate the models, configure system messages, and validate your prompt responses; and then last, fine-tuning the model, and obviously integrating your safety net, your responsible AI, as well.
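For completeness, the standalone Azure AI Content Safety service mentioned above also has its own SDK, so you can score text outside of the built-in model filters. A hedged sketch using the azure-ai-contentsafety package; the endpoint and key placeholders are mine, and the property names follow the GA version of the SDK as far as I know.

```python
import os

from azure.core.credentials import AzureKeyCredential
from azure.ai.contentsafety import ContentSafetyClient        # pip install azure-ai-contentsafety
from azure.ai.contentsafety.models import AnalyzeTextOptions

client = ContentSafetyClient(
    endpoint=os.environ["CONTENT_SAFETY_ENDPOINT"],   # https://<resource>.cognitiveservices.azure.com
    credential=AzureKeyCredential(os.environ["CONTENT_SAFETY_KEY"]),
)

# Analyze a piece of text (for example a user prompt or a model response)
# across the hate / sexual / violence / self-harm categories.
result = client.analyze_text(AnalyzeTextOptions(text="Some user-generated text to screen."))

for category in result.categories_analysis:
    print(category.category, "severity:", category.severity)
```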
I hope with this session I managed to walk you through a sneak peek of what Azure AI services using AI Foundry can do for your organizations, especially when you are thinking about developing your own custom copilots, as we like to call them. I wanna thank you for having joined me in this session. I would also like to thank the Conf42 team for having invited me as a speaker to present at this conference. I hope you enjoy the rest of the Conf42 Large Language Models conference, and hopefully we'll meet again soon at one of the other Conf42 conferences in the near future. Take care for now, have a great day, and don't hesitate to reach out if you should have any more questions. Take care, my friends.

Peter De Tender

Business Program Manager - Azure Technical Trainer @ Microsoft



