Transcript
Hi everyone.
Welcome to the Conf42 Machine Learning Conference.
Hope you're having a good day.
My name is S Kati and I've actually been working as a senior software
engineer at Meta for about a year now.
Before that I used to work at Google for about four years and prior to that,
Walmart Labs and prior to that in a startup for about another five years.
Today we are going to talk about something that I've been working on for quite some time.
And I think it'll give you a great deal of insight on how to make your decisions when it comes to building analytics pipelines.
On that note, we are basically going to do a deep dive into the fascinating
world of real time machine learning.
We'll explore how to build analytics pipelines that deliver insights in less than a second, given the enormous volume of data generated today, which is around 2.5 quintillion bytes each day. That is not a trivial amount.
It's quite enormous, right?
Traditional methods simply aren't keeping up.
We'll unpack why speed matters, look at the challenges involved, and discuss how industries are gaining major advantages from real-time ML.
Now let's actually talk about the data explosion challenge.
First, let's understand the sheer scale of the data problem we face every day.
Like literally as we said, we produce about 2.5 quintillion bytes of data,
which is even hard to visualize.
So to put it into perspective: if we were to stack Blu-ray discs containing this data, they'd reach all the way to the moon.
Again, it's almost impossible to visualize.
Around 75% of the companies now rely on machine learning applications
that need immediate responses.
So for these businesses, waiting isn't an option.
Real-time analytics provides insights 35 percent faster and boosts efficiency by almost 42%.
Imagine how crucial that speed is for decision making during live events, medical diagnostics, and financial transactions.
Of course, you will see many businesses which still rely on batch processing.
We'll come to what batch processing is versus what stream processing is.
But the thing I want you to keep in perspective is that every business has its own needs.
For some businesses, it's very important to process data in real time.
Take financial transactions, for example: real-time insights are the ones which matter.
If a fraud occurred, let's say, two hours ago but you're only detecting it right now, it wouldn't make much sense.
We'll put things into perspective as we go through the slides, but I just wanted to illustrate how businesses want to process data.
And of course, it differs from business to business.
Now, coming to batch versus stream processing: I want you to take a moment to understand what batch processing is versus what stream processing is.
Batch processing typically takes hours or days, and is used for things like monthly or weekly reports.
For example, let's say you are running an ad tech business and you want your business people to get more insight into how your ads are performing over a day, the past week, the past month, and so on.
You would rather stick with batch processing pipelines which run, say, every hour or every day, and that would definitely fit your needs.
That's totally fine.
You would not want to go for a very engineering-heavy stream processing pipeline.
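To make that concrete, here is a minimal sketch of what such a scheduled batch report could look like, assuming the raw ad events land in a file with campaign_id, event_time, impressions, and clicks columns; the file name and columns are purely illustrative, not from any specific pipeline.

```python
# Minimal sketch of a scheduled batch ad report. The input file and its
# columns (campaign_id, event_time, impressions, clicks) are assumptions
# made for this example, not a real pipeline's schema.
import pandas as pd

def build_daily_ad_report(path="ad_events.parquet"):
    events = pd.read_parquet(path)
    events["day"] = pd.to_datetime(events["event_time"]).dt.date
    report = (
        events.groupby(["day", "campaign_id"])
        .agg(impressions=("impressions", "sum"), clicks=("clicks", "sum"))
        .reset_index()
    )
    report["ctr"] = report["clicks"] / report["impressions"]
    return report  # a job like this could be scheduled hourly or daily

if __name__ == "__main__":
    print(build_daily_ad_report().head())
```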
But on the other note, think about what I have been discussing just now.
Think about fraud detection.
What if your bank took hours to identify suspicious activities?
It's not at all good.
So now, coming to that, there is an intermediate approach called micro-batching, which cuts the latency down to milliseconds, and which in my opinion is really good.
Micro-batching has been serving near-real-time needs for quite some time now.
But real-time streaming goes a step further: how do you cut that latency down to under 10 milliseconds?
I agree there are instances where you would want to go for micro-batching, but you definitely want to understand where your business fits, right?
So let's say you are on amazon.com and you are buying a couple of things, say diapers for your kids.
It takes a couple of minutes for you to go from the homepage to the product page, to the add-to-cart page, and then the checkout page.
Now let's say that you are an engineer at Amazon and you want to identify all the people who are buying diapers.
So you got an event into Amazon saying, hey, somebody bought diapers.
You would want to identify what items they were looking at prior to that, as in what the path of the user was, starting from the homepage, which is really important, right?
Otherwise, where would you want to put ads?
How would you categorize things which are usually bought together?
For that sort of thing, you want to analyze the path of the user.
To do that, you really want to use micro-batching: when the purchase event occurs, you go back a couple of windows, which is a couple of seconds or minutes, to identify what really went on.
That sort of thing is where micro-batching is really used, and it works really well.
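To illustrate the idea, here is a rough, framework-free sketch of a micro-batch loop: events are collected for a short interval, and when a checkout shows up, the recent windows are searched to reconstruct that user's path. The event fields (user_id, page, ts) and the window sizes are assumptions for the example; in practice, tools like Spark Structured Streaming give you this kind of micro-batch trigger out of the box.

```python
# Hand-rolled micro-batch sketch: process events in short intervals and
# look back over recent windows to reconstruct a buyer's path.
# Event fields (user_id, page, ts) are assumed for illustration.
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 5        # micro-batch interval
LOOKBACK_WINDOWS = 60     # how many past windows to keep (~5 minutes here)

recent_windows = deque(maxlen=LOOKBACK_WINDOWS)

def process_micro_batch(batch):
    """Called once per interval with the events that arrived in it."""
    by_user = defaultdict(list)
    for window in recent_windows:
        for ev in window:
            by_user[ev["user_id"]].append(ev)
    for ev in batch:
        by_user[ev["user_id"]].append(ev)
        if ev["page"] == "checkout":
            path = [e["page"] for e in sorted(by_user[ev["user_id"]], key=lambda e: e["ts"])]
            print(f"user {ev['user_id']} bought after path: {path}")
    recent_windows.append(batch)

def run(event_source):
    """event_source is assumed to yield event dicts as they arrive."""
    batch, deadline = [], time.time() + WINDOW_SECONDS
    for ev in event_source:
        batch.append(ev)
        if time.time() >= deadline:   # close the window and process it
            process_micro_batch(batch)
            batch, deadline = [], time.time() + WINDOW_SECONDS
```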
But there are obviously some businesses which want to take it a step further, to real-time streaming, which, as I said, cuts the latency to under 10 milliseconds.
Now, this near-instant processing is essential in scenarios like live stock trading, real-time gaming, dynamic online content, and so on.
As you can see, many FinTech startups have come up, and FinTech scenarios have literally exploded after the AI boom.
Live stock trading is an excellent example where real-time streaming matters that much more.
You definitely want real-time data on how the stock is performing or how the options are being sold, and so on, to make an informed decision about what to do next.
That's where real-time streaming is required.
So I hope you have at least formed a mental picture of where batch, micro-batching, and stream processing sit in the overall analytics pipeline landscape.
As part of this talk, I want to dig deep into streaming analytics specifically.
Batch, I think, is easily understood, and micro-batching is a little harder to grasp.
But stick with me and I promise you'll get better insight.
So now, starting with financial services: real-time fraud detection is really important here.
Of course, as the internet exploded and, frankly, as AI is currently taking over the world, you can see more and more new types of frauds and scams popping up.
So it is much more important to do real-time fraud detection, so as to make sure that banking stays secure.
Banks that adopted real-time fraud detection have reduced their losses by about 27%; for a mid-sized bank, that translates to around 15 million dollars in savings each year, which is no small amount.
Traditional fraud detection mechanisms analyze data after the event has occurred, which, as we were just discussing, is often too late, because the transaction has already happened and it is now near impossible to reverse it.
But the real-time systems we are talking about flag unusual activities as they are occurring.
So you can actually stop the fraudulent activity before the event completes, which is what is of utmost importance.
For example, think about detecting multiple purchases in distant locations.
I've had the situation where my credit card got hacked, and I assume many people have gone through the same scenario.
While I myself live in California, I've had transactions go off in, say, Tennessee or Florida.
Looking at my recent transactions, the one which occurred literally an hour ago happened in California, but the next transaction, occurring right now, is happening in Tennessee, which is practically impossible.
So flagging that particular transaction as fraud would save me a lot of money.
These are the sorts of scenarios that matter.
So how do we detect multiple purchases in distant locations happening almost simultaneously?
That is something traditional batch methods would completely miss, right?
If you took a batch processing approach to this problem, you would see that in this particular hour a transaction happened in California, and in the next hour a transaction happened in Tennessee.
You would not necessarily get the whole picture in time; even if the batch process picked it up on an hourly basis, the transactions might have already gone through.
The person who hacked my credit card might have already made thousands of dollars of purchases, and I am the one responsible for paying for them, which doesn't really make sense.
So such real-time fraud detection mechanisms are very important, and they are becoming more important day by day.
Now coming to another use case: e-commerce.
I myself use Amazon and Walmart quite frequently, and what I have observed is that personalization for customers in these e-commerce businesses has gotten way better than it used to be.
E-commerce businesses have seen major benefits; real-time personalization can boost sales by up to 18%.
Picture yourself doing this: as you are browsing, real-time analytics immediately updates product recommendations based on your current session, inventory status, pricing strategies, and so on.
This dynamic approach can increase average order values by 12%.
As I was alluding to in my previous example, I have a kid, so the thing that usually pops into my mind is buying diapers and so on.
Take the example of buying diapers: running on less sleep for quite some time now with a kid, you sometimes forget, or might not even realize, what you really need to purchase.
So as I'm browsing diapers, if the real-time personalization kicks in, the analytics pipelines say, hey, we've seen that you're buying diapers now and it's been some time since you bought diaper cream or moisturizing lotion for your kid, why don't you add that to the cart?
Or, not prompting per se, but something like "people have also bought moisturizing cream," and so on.
That would trigger something in my brain saying, hey, I forgot to buy this, let me add it.
This sort of personalization is something I would love in any new product, because it is important for the tech to make our lives better.
Along with that, real-time systems prevent customers from facing out-of-stock items.
I've had this happen: you add a couple of items to a Target or Walmart shopping cart, and by the time you are just about to check out, or about to get it delivered, it says out of stock.
Such events are problematic for the user experience, and handling them in real time is much more important.
And these matter even more during peak shopping events like holidays or major sales.
For example, when I was working at Walmart, Thanksgiving was a major season, right?
You want to buy gifts for your loved ones and make sure your family's happy.
I've seen this happen many times where a particular sale pops up.
Let's say you want to buy an iPad for your family, and suddenly a sale pops up on Walmart: 20% off an iPad.
That's not an ordinary sale, right?
So many people, along with you, are flocking to walmart.com to buy that particular thing, and as such, it makes much more sense to update them in real time on what is going on with that sale event.
As in, are you a little too late? How many of the iPads are actually still left?
That way you can provide customers with a way better shopping experience.
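Here is a very small, hypothetical sketch of that kind of behavior: recommendations are drawn from items commonly bought together with what is in the session, re-checked against live stock before being shown, and a low-stock banner is surfaced during a hot sale. The catalog, stock counts, and pairings are all made up.

```python
# Hypothetical inventory-aware, session-aware recommendations.
# The catalog, stock counts, and "bought together" pairs are invented.
bought_together = {"diapers": ["diaper_cream", "baby_lotion", "wipes"]}
live_stock = {"diapers": 120, "diaper_cream": 0, "baby_lotion": 35, "wipes": 400, "ipad": 3}

def recommend(session_items, max_items=3):
    candidates = []
    for item in session_items:
        candidates.extend(bought_together.get(item, []))
    # drop anything that just went out of stock so the shopper never adds it
    in_stock = [c for c in candidates if live_stock.get(c, 0) > 0]
    return in_stock[:max_items]

def stock_banner(item):
    left = live_stock.get(item, 0)
    return f"Only {left} left!" if 0 < left <= 5 else ""

print(recommend(["diapers"]))   # ['baby_lotion', 'wipes']
print(stock_banner("ipad"))     # 'Only 3 left!'
```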
Now let's discuss a completely different side of the coin: manufacturing.
Factories have dramatically improved operations by shifting to real-time predictive maintenance.
Machines are now equipped with advanced sensors which continuously stream data: what is the temperature, what is the vibration, what is the operating speed of the machine?
Real-time machine learning models analyze this data instantly, and they can actually predict equipment failures before they happen.
You may ask, why is this important?
Manufacturing is not something you regularly see on a day-to-day basis, so why does it really matter?
Regular maintenance of machines is important so as to streamline operations and avoid unexpected delays.
For example, an off-the-cuff example, just thinking out loud: let's say you have ordered some toys for delivery, and the order has gone back to the manufacturing plant to be produced.
Again, this is a somewhat contrived scenario, but let's say the toy manufacturing has started, and somewhere down the line the machines failed.
Now the toy, which is supposed to be delivered for your kid's birthday, has had its delivery date pushed back by, say, four to five weeks.
That is not acceptable, right?
There is a particular reason why you picked that toy, and there is a particular timeline you have in mind, but because of a manufacturing delay, all of this happened.
So equipping machines so that they can stream this data, saying, hey, my current temperature is such and such, this is the pressure going through, and this is the number of toys built so far, would help us quickly understand what's going on.
We have seen that, say, the previous day this machine was building so many toys per minute, but now that has gone down drastically; what is really going on?
So you are able to identify the issue with the machine way before something really bad happens, and as such you are able to fix it so that your operations are far more streamlined.
As more and more machines come online and more automation kicks in, this becomes even more important.
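A back-of-the-envelope sketch of that idea, with invented sensor fields and thresholds: compare each machine's latest throughput and temperature against a rolling baseline of its own recent readings and raise an alert when they drift too far.

```python
# Rolling-baseline anomaly check for machine sensors. Thresholds, field
# names, and readings are invented for illustration.
from collections import defaultdict, deque

BASELINE_SIZE = 60  # last 60 readings ~ "how the machine usually behaves"
history = defaultdict(lambda: deque(maxlen=BASELINE_SIZE))

def on_sensor_reading(machine_id, units_per_minute, temperature_c):
    readings = history[machine_id]
    alerts = []
    if len(readings) >= 10:  # need some baseline before judging
        avg_units = sum(r[0] for r in readings) / len(readings)
        avg_temp = sum(r[1] for r in readings) / len(readings)
        if units_per_minute < 0.6 * avg_units:
            alerts.append(f"{machine_id}: throughput dropped to {units_per_minute} (baseline {avg_units:.0f})")
        if temperature_c > avg_temp + 15:
            alerts.append(f"{machine_id}: running {temperature_c - avg_temp:.0f}C hotter than usual")
    readings.append((units_per_minute, temperature_c))
    for a in alerts:
        print("ALERT:", a)  # in practice this would page maintenance / open a ticket
```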
Now let's talk about the technical challenges, starting with event-time processing.
Implementing real-time machine learning is not without its own challenges.
A significant hurdle is event-time processing: accurately handling data based on when events actually happened, not when they were recorded.
It's a little hard to visualize, so let's take a moment to understand what is going on.
There is a slight delay between when an event actually occurs and when it is recorded.
Let's say you clicked the add-to-cart button on amazon.com; that event is sent to the backend server for processing.
You clicked add to cart right now, but because of a server delay or something similar, the backend processing system received that event a couple of minutes late.
That does not mean the add to cart never happened; it means the add-to-cart event was recorded late.
That is the distinction: when the event occurred versus when it was actually processed.
Around 15% of streaming data arrives out of order, because there are a lot of issues: these are all computers where the events are being recorded, so there may be a network problem, a machine that went down, and so on, which makes out-of-order streaming data a very common occurrence.
That complicates accurate analytics, and it has been one of the bigger challenges.
To overcome this, methods like sliding windows and watermarking are used to ensure data accuracy; let's take a moment to understand what these are.
With a sliding window, going back to the e-commerce example: you are a software engineer and, as part of the data, an add-to-cart event has come to you.
You see that, but weirdly enough, you don't see an event where the user landed on the homepage.
That doesn't make sense; how would you add to cart before going to the homepage?
You go to the homepage, you browse the product, you then add to cart, so there are three events, but somehow, because of some server delay or whatever, add to cart is the one you got first.
So what you say is, hey, I'll wait for a minute or two for the homepage event and the product page event; that is how you ensure data accuracy.
Once you have waited those two minutes and gotten the homepage and product page events, you bundle them together, saying, okay, now I have the full data, and I can order those events properly: homepage first, product page next, then the add-to-cart page.
So you can see the user path, or the user behavior, as it is needed.
And if you take another example, stock market trading, the precise timing of transactions is critical.
You cannot just say, I want to buy a stock right now, but because the event got delayed the stock price rose, and then tell the customer that, unfortunately, we could not buy the stock.
Incorrect ordering can cause significant financial impact.
These sorts of events have to be kept in mind, and as such, proper methods have been devised, like sliding windows and watermarking.
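Here is a hand-rolled illustration of event-time ordering with a watermark, which frameworks like Flink implement for you: events carry their own event_time, they are held briefly, and once the watermark (the maximum event time seen minus an allowed lateness) passes an event, it is emitted in event-time order. The field names and the two-minute lateness are assumptions for the example.

```python
# Hand-rolled event-time buffer with a watermark; real frameworks like
# Flink provide this. Field names and lateness are assumptions.
import heapq
import itertools

ALLOWED_LATENESS = 120  # seconds we are willing to wait for stragglers

class EventTimeBuffer:
    def __init__(self):
        self.heap = []                     # ordered by event time
        self.counter = itertools.count()   # tie-breaker for equal timestamps
        self.max_event_time = 0

    def add(self, event):
        heapq.heappush(self.heap, (event["event_time"], next(self.counter), event))
        self.max_event_time = max(self.max_event_time, event["event_time"])
        return self.flush()

    def flush(self):
        # watermark = latest event time seen minus the allowed lateness
        watermark = self.max_event_time - ALLOWED_LATENESS
        ready = []
        while self.heap and self.heap[0][0] <= watermark:
            ready.append(heapq.heappop(self.heap)[2])
        return ready  # emitted in event-time order

buf = EventTimeBuffer()
# add_to_cart arrives first even though it happened after the homepage visit
for ev in [{"page": "add_to_cart", "event_time": 300},
           {"page": "homepage", "event_time": 100},
           {"page": "product", "event_time": 200},
           {"page": "checkout", "event_time": 500}]:
    for out in buf.add(ev):
        print(out["page"], out["event_time"])  # homepage, product, add_to_cart
```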
Next, let's understand model drift in continuous systems, which is another major challenge.
Over time, data patterns naturally change.
This causes the models to become less accurate, because they were trained on a particular set of data with particular patterns, but the patterns have since changed.
You usually have to update the models very frequently, meaning training has to occur very frequently, so that the models keep up with the data trend.
Take retail businesses: customer shopping behaviors change seasonally.
In any clothing business, summer clothes and winter clothes are very common, and the buying pattern for summer clothes increases over summertime and decreases as winter approaches.
The same goes for winter clothes, which increase over the wintertime and decrease as spring approaches.
Considering such seasonal behaviors, which influence what products people usually buy, real-time systems should address this by detecting changes in the data distribution and automatically retraining the model.
That retraining of models is very important to make sure you keep up with the data trend.
Think of it as keeping the system fresh and accurate, automatically adjusting to shifts without manual intervention.
There will always be some data changes which humans might not have been thinking about; for example, I've seen scenarios where flower purchases increased by quite a lot, and when some other events occurred they decreased again, and so on.
Such things are almost impossible to predict; as a human, you cannot keep track of every single data pattern, so you want your models to automatically take care of it.
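One simple way to implement that drift check, sketched below with placeholder values: compare the distribution of a feature in recent traffic against its distribution in the training data using a two-sample Kolmogorov-Smirnov test, and trigger retraining when they diverge. The retrain hook and the p-value cutoff are placeholders; production systems typically watch many features and often use metrics like PSI as well.

```python
# Drift check via a two-sample KS test; cutoff and retrain() hook are
# placeholders for illustration.
import numpy as np
from scipy.stats import ks_2samp

def drift_detected(training_values, recent_values, p_threshold=0.01):
    result = ks_2samp(training_values, recent_values)
    return result.pvalue < p_threshold, result.pvalue

def maybe_retrain(training_values, recent_values, retrain):
    drifted, p = drift_detected(training_values, recent_values)
    if drifted:
        print(f"drift detected (p={p:.4f}), kicking off retraining")
        retrain()
    else:
        print(f"no significant drift (p={p:.4f})")

# e.g. summer basket sizes vs. the winter data the model was trained on
rng = np.random.default_rng(0)
maybe_retrain(rng.normal(50, 10, 5000), rng.normal(62, 12, 5000), retrain=lambda: None)
```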
Now, talking about the architectural comparison: choosing the right streaming framework is critical.
Until now, we have established why stream processing is important and how to overcome the challenges.
Now let's discuss what streaming frameworks are actually out there, what the differences between them are, and how you compare them against each other.
For example, Apache Kafka can process about 2 million events per second with around 10 milliseconds of latency.
Apache Flink is even faster, reaching sub-millisecond latency, which is ideal for financial markets or real-time gaming.
Cloud services like AWS Kinesis and Azure Event Hubs provide simpler integration, but tend to have slightly higher latency.
So businesses often adopt hybrid approaches to balance performance, complexity, and convenience.
For instance, using Kafka for event ingestion and Flink for processing might be ideal for complex environments.
You might ask, why not just go for the fastest one?
There are a lot of reasons; when it comes to big companies, or to your own use case, it depends on what you're looking for, and it's not always ideal to go for the highest-performing option.
I'm not saying that increasing latency is what you want, but there are many things you have to keep in mind: how many teams are looking for this data, what latency is tolerated, what latency you really want to shoot for, and so on.
Some things are easy to integrate, some are very hard to integrate, and sometimes the learning curve is way too steep.
There are a lot of decisions and thought processes which go into selecting a particular streaming framework.
But at least over the span of my career, I've always worked with hybrid systems.
As I was just saying, Kafka for event ingestion is very widely used because it allows for seamless event ingestion, and Flink for processing is very widely used as well.
There are also Spark Streaming and Spark Structured Streaming, and so on, which provide similar functionality, but it really depends on what functionality you are looking for.
For instance, some frameworks provide better watermarking, some provide better latency.
If you require better watermarking, you go with the framework that supports it.
I'm just trying to lay out the details so that you can make an informed decision when choosing a particular streaming framework.
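For the ingestion half of such a hybrid setup, a minimal sketch might look like the following, assuming the kafka-python client and a broker on localhost:9092; the topic name and event fields are made up, and a separate Flink or Spark job would subscribe to the topic for processing.

```python
# Ingestion sketch: push clickstream events into a Kafka topic.
# Assumes the kafka-python client and a broker on localhost:9092;
# the topic name and event fields are illustrative.
import json
import time
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def publish_click(user_id, page):
    event = {"user_id": user_id, "page": page, "event_time": time.time()}
    # key by user so all of a user's events land in the same partition, in order
    producer.send("clickstream", key=user_id.encode("utf-8"), value=event)

publish_click("u-123", "homepage")
publish_click("u-123", "add_to_cart")
producer.flush()
```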
Now, let's talk about the implementation: how does it work when you are trying to implement a real-time architecture?
First, the data enters through high-speed brokers like Kafka.
Kafka is used so that if, say, a server fails and you somehow don't process the event downstream, you still have time to process it later, and it can handle millions of events per second.
Next, a processor like Flink manages complex computations and real-time analysis.
Then a real-time feature store quickly delivers essential data to the models, because that is important for creating features.
And finally, an optimized model serving infrastructure produces predictions and constantly monitors system accuracy.
This setup includes automatic retraining when accuracy drops, ensuring consistent performance and reliability.
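Tying those stages together, here is a minimal sketch of the serving side under some assumptions: the feature store is Redis (one common choice), the keys and feature names are placeholders, and the model object is hypothetical. It scores an incoming event with precomputed features and tracks a rolling accuracy estimate that can trigger retraining when it drops.

```python
# Serving-side sketch: fetch features from a Redis-backed feature store,
# score the event, and monitor rolling accuracy to trigger retraining.
# Keys, feature names, and the model object are placeholders.
from collections import deque

import redis  # assumes a Redis feature store at localhost:6379

feature_store = redis.Redis(host="localhost", port=6379, decode_responses=True)
recent_outcomes = deque(maxlen=1000)  # rolling window of (predicted, actual)

def get_features(user_id):
    raw = feature_store.hgetall(f"user_features:{user_id}")
    return {k: float(v) for k, v in raw.items()}

def score(event, model):
    features = get_features(event["user_id"])
    return model.predict(features)  # e.g. a fraud probability

def record_outcome(predicted_label, actual_label, retrain, min_accuracy=0.9):
    recent_outcomes.append((predicted_label, actual_label))
    if len(recent_outcomes) == recent_outcomes.maxlen:
        accuracy = sum(p == a for p, a in recent_outcomes) / len(recent_outcomes)
        if accuracy < min_accuracy:
            retrain()  # automatic retraining hook when accuracy drops
```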
Now let's talk about the key takeaways and next steps.
To wrap things up, implementing real-time machine learning has clear, substantial benefits: dramatically reducing fraud, significantly improving sales in e-commerce, and minimizing downtime in manufacturing.
The keys to success are to choose your architecture very carefully, continuously monitor performance, and automate model retraining.
I would say begin by targeting your most impactful, latency-sensitive applications.
From there, incrementally build your capabilities and be on the lookout for changes in your data environment.
Thank you so much for your attention.
I hope you learned a great deal about streaming data architectures and streaming frameworks, and where they are used.
Again, thanks for attending the conference.
Hope you have a really good time.