Conf42 Cloud Native 2025 - Online

- Premiere: 5PM GMT

Beyond Pre-training: Scaling Cloud-Native LLMs with Contextual Data for Real-Time AI

Abstract

Static LLMs are outdated the moment they’re trained. Want AI that’s real-time, relevant, and enterprise-ready? Learn how RAG, CDPs, and cloud-native architectures unlock hyper-personalized AI, boost engagement by 3x, and solve data fragmentation at scale.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello everyone. I'm Ankit. I currently work at Twilio as a director of engineering, where I lead teams working on predictive and generative AI products. I'm excited to talk with you about how we can enhance the performance of large language models by incorporating contextual data. We'll also discuss how this approach moves us beyond traditional pre-training methods, creating much more powerful and personalized AI experiences.

First, let's explore the strengths of large language models. These models have impressive capabilities when it comes to generating content. They excel in conversational interactions, making user interfaces feel natural and intuitive. In fact, this interface has been key to democratizing AI across different user personas: a prompt is now all it takes to interact with a model. They're also incredibly versatile and perform remarkably well across various tasks, including summarization, generating code snippets, crafting compelling copywriting, and more recently even advanced reasoning tasks.

Despite these impressive strengths, LLMs also come with some critical limitations. For one, they're primarily trained on static datasets, which means the information they have access to can become outdated or incomplete. Additionally, LLMs sometimes generate incorrect or fabricated information, what we often refer to as hallucinations. Perhaps most importantly, without contextual information, LLMs struggle to deliver genuinely personalized responses, limiting their effectiveness in many real-world scenarios. A support chatbot is a practical example of this issue: without appropriate context, a large language model might respond inaccurately or irrelevantly, because it doesn't have access to the information it needs to tailor its response effectively. This limitation underscores the importance of incorporating contextual data.

The solution to this issue is generally called Retrieval-Augmented Generation, or RAG. The core idea behind RAG is straightforward: retrieve relevant contextual information from various data sources and integrate it into the prompts we give our large language models. This process alone significantly improves the model's responses, making them far more accurate, timely, and personalized. RAG also enables the inclusion of current data, specialized domain knowledge, and even proprietary knowledge bases as part of the prompt. Most important in the context of today's discussion is its ability to add custom, personal, contextual information into the prompt to drive hyper-personalized AI interactions.
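To make the RAG pattern concrete, here is a minimal sketch in Python. The knowledge base, the toy keyword-overlap retriever, and the call_llm stub are all hypothetical placeholders, not the system described in the talk; a production setup would use a vector store and a real model endpoint.

```python
# Minimal RAG sketch: retrieve relevant context, then inject it into the prompt.
# The retriever is a toy keyword-overlap scorer standing in for a real vector
# store; call_llm is a stub for whatever model endpoint you actually use.

KNOWLEDGE_BASE = [
    "Returns are accepted within 30 days of delivery with a receipt.",
    "Refunds are issued to the original payment method within 5 business days.",
    "Items marked final sale cannot be returned or exchanged.",
]

def retrieve(query: str, docs: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query."""
    query_words = set(query.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(query_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def call_llm(prompt: str) -> str:
    """Stub: replace with a real LLM API call."""
    return f"[LLM response to a {len(prompt)}-char prompt]"

def answer_with_rag(question: str) -> str:
    # 1. Retrieve context relevant to the user's question.
    context = retrieve(question, KNOWLEDGE_BASE)
    # 2. Inject the retrieved context into the prompt.
    prompt = (
        "Answer using only the context below.\n\n"
        "Context:\n- " + "\n- ".join(context) + "\n\n"
        f"Question: {question}"
    )
    # 3. Generate a grounded, up-to-date response.
    return call_llm(prompt)

print(answer_with_rag("Are returns accepted without a receipt?"))
```

Whatever retriever you use, the key step is the same: the retrieved text is placed directly into the prompt, so the model answers from current, relevant data rather than only from its static training set.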
However, as we've seen, simply combining LLMs with a generic knowledge base isn't enough to drive the personalization users expect. We need something more targeted. In the earlier example, even though the AI support agent has access to documents describing the return policy, it doesn't have the personalized context of the user, leading to a somewhat poor customer experience. And this brings us to the concept of customer data platforms, or CDPs. We'll see how CDPs, traditionally known for marketing use cases, can enable personalization at scale by providing rich, personalized data that AI systems can use directly. Before we dig into how a CDP can be a powerful source of contextual information, let's first understand at a high level what exactly a customer data platform is.

In simple terms, a CDP is a sophisticated tool that gathers and organizes customer data from a variety of sources and interactions, and sends it to downstream apps and destinations. The sources could be your web app, your chatbot, your POS systems, or even your data warehouse. The destinations are the apps where you want to activate this data, such as marketing automation platforms, ad destinations, and so on. CDPs also increasingly create a unified and comprehensive view of each individual customer, enabling businesses to understand a 360-degree view of each customer in real time. There are also nuances like identity resolution across different platforms that help build rich user profiles.

Let's see how this works in action. As the user interacts with different applications, the user profile gets hydrated, allowing for deeper personalization in subsequent interactions. You can also see how the chat experience becomes seamless as the profile gets richer and richer. So what we are seeing is that, in the context of AI applications, a CDP can act as a very powerful personalization engine. It allows AI applications to access relevant, personalized data effortlessly, thereby significantly enhancing the user experience, as in the example we have been following so far. Depending on the context of the interaction, all you need to do is pull relevant information from the user profile, add it to the prompt for the large language model, and you're able to generate a much more personalized response.

In fact, this is very similar to how the memory feature in ChatGPT works. Within ChatGPT, you can add your preferences or information about yourself as memory, and that enables the model to maintain context, ensuring consistent and relevant interactions over time. But CDPs can go a lot further in providing comprehensive context across different digital touchpoints, whether it's an email interaction or an action you performed in the mobile app. With a CDP, and with the cost of LLM inference coming down dramatically, companies can unlock capabilities that require personalization at scale: one-to-one personalized messaging, dynamically personalized web experiences, and highly effective customer support interactions. Imagine customer support that instantly understands your context without needing extensive background information; that is exactly what a CDP can deliver.

So far we have seen how a CDP is sufficient for delivering personalized experiences. Now I want to focus on how a CDP is also necessary for improving personalized enterprise applications, and in particular why you need a more sophisticated data infrastructure than just a key-value store to drive this personalization. The first reason is simply that CDPs help break down information silos. Customers may make purchases on different platforms, with those events collected at different touchpoints by different teams. Similarly, one team may manage the web app while another manages the mobile app, and behavior is captured differently across the two platforms. Having a unified way of collecting customer data and unifying it into customer profiles is core to unlocking value from this data.
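As a rough illustration of this profile-hydration idea, the sketch below merges events from different touchpoints into a single unified profile. The event shapes and the email-based identity resolution are simplified assumptions for the example, not any particular CDP's data model.

```python
# Toy sketch of CDP-style profile unification: events from different
# touchpoints are merged into one profile per resolved identity.
# Keying on email is a deliberate simplification; real identity resolution
# handles anonymous IDs, device IDs, and cross-platform merges.

from collections import defaultdict

events = [
    {"source": "web",    "email": "jo@example.com", "traits": {"plan": "pro"}},
    {"source": "mobile", "email": "jo@example.com", "traits": {"favorite_cuisine": "thai"}},
    {"source": "pos",    "email": "jo@example.com", "traits": {"last_purchase": "espresso maker"}},
]

profiles: dict[str, dict] = defaultdict(lambda: {"traits": {}, "sources": set()})

for event in events:
    # Identity resolution (simplified): all events sharing an identifier
    # hydrate the same unified profile.
    profile = profiles[event["email"]]
    profile["traits"].update(event["traits"])
    profile["sources"].add(event["source"])

# One 360-degree view assembled from web, mobile, and POS touchpoints.
print(profiles["jo@example.com"])
```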
Using real-time data capture and identity resolution, you can build a rich customer profile for each of your users. And finally, CDPs can provide the much-needed data governance for different types of datasets, depending on their sensitivity.

Now that we have established the need for personal context and how CDPs can fulfill that need, let's look at different ways of implementing this with a CDP. One potential, or naive, solution would be to embed all the available customer profile information in each LLM prompt. While this seems comprehensive, in the sense that the model gets all the data you have about that customer, it often leads to very large prompts, which increases computational cost and latency and ultimately makes the solution impractical at scale. Depending on how large your user profile is, you could be storing many characteristics and behaviors for each user, and as a result this profile could be huge.

Another approach would involve selectively using only specific, predetermined customer traits or events from the user profile. For instance, if I am talking to a customer bot and asking for a restaurant recommendation, and I know the user's location is stored as part of the user profile, I can fetch it and use it to provide more tailored restaurant recommendations. While this helps reduce the prompt size, it also severely limits the richness of the personalization. First, it misses any other relevant information that may be available in the CDP or the user profile. For instance, another team might, through some other means, be recording favorite cuisines for each user; because as an engineer implementing this feature I did not know that trait existed on the user profile, I have no deliberate way of including it in my implementation. The other challenge is that this approach requires comprehensive knowledge of what is stored in the user profile, along with a semantic understanding of everything that exists there.

And this brings us to our final recommendation for implementation: dynamically retrieve just the right amount of relevant customer data based on the specific interaction or application context. The way you do it is to take the natural language prompt, compute its semantic similarity with the different data elements that exist on the user profile in real time, and inject the most relevant data into the LLM prompt. This keeps the injected context small, and it also lets you use all of the data stored in the user profile without any prior knowledge of what exists there, because the semantic similarity check happens at runtime. This method is efficient, flexible, and scalable, and it ensures optimal use of all the data you have without compromising on personalization quality.
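Here is a minimal sketch of that dynamic retrieval step, using a hypothetical user profile. The embed function is a toy bag-of-words stand-in; in practice you would use a real embedding model, but the flow is the same: score each profile field against the live prompt, keep the top matches, and inject only those into the prompt.

```python
# Dynamic context retrieval sketch: rank profile fields by semantic
# similarity to the live prompt and inject only the top matches.
# embed() is a toy bag-of-words vectorizer standing in for a neural
# embedding model; the profile contents are hypothetical.

import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts (swap in a real model in production)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

user_profile = {
    "location": "the user lives in Austin, Texas",
    "favorite_cuisine": "the user's favorite food for dinner is Thai",
    "plan": "the user subscribes to the pro plan",
    "last_purchase": "the user recently bought an espresso maker",
}

def relevant_context(prompt: str, profile: dict[str, str], top_k: int = 2) -> list[str]:
    """Return the top_k profile fields most similar to the prompt."""
    query_vec = embed(prompt)
    ranked = sorted(profile.values(), key=lambda v: cosine(query_vec, embed(v)), reverse=True)
    return ranked[:top_k]

prompt = "Can you recommend a good restaurant for dinner tonight?"
context = relevant_context(prompt, user_profile)
full_prompt = "User context:\n- " + "\n- ".join(context) + "\n\n" + prompt
print(full_prompt)
```

Because the ranking happens at runtime, a trait like favorite_cuisine gets pulled in for a restaurant question even if the engineer who built the feature never knew that trait existed, while irrelevant fields like last_purchase stay out of the prompt.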
Let's wrap up with a few key takeaways. First and foremost, for a lot of use cases, building an AI application with large language models actually translates to solving a data infrastructure, data quality, and data governance problem in the first place. Secondly, we saw that CDPs provide a compelling option as a personalization engine for large language models. And finally, we saw how CDPs can act as long-term memory for large language models by deriving structured information, storing it as user profiles and events, and making it available to the model as and when needed for generating hyper-personalized responses. Thank you all so much for your attention and engagement today. If you'd like to continue this discussion or have any questions on any of the topics we covered, please feel free to reach out to me. Thanks.

Ankit Awasthi

Director of Engineering @ Twilio



