AI-Driven Rate Limiting for Resilient and Cost-Efficient Cloud API Protection

Abstract

Discover how AI-powered rate limiting transforms API security! Learn to dynamically detect malicious traffic, cut false positives by 68%, and reduce infrastructure costs by 27%. Gain a proven, cloud-ready roadmap for resilient, scalable, and cost-efficient API protection.

Transcript

This transcript was autogenerated. To make changes, submit a PR.

Hello everybody, I'm Rehana. Good morning, and I hope you're having a wonderful day. Today we are going to talk about AI-driven rate limiting for resilient and cost-efficient cloud API protection. Wow, that's a mouthful, isn't it? And it sounds incredibly technical. Let me rephrase it to simplify it: it's all about keeping the doors to our digital world open for you and other legitimate users, and closed to attackers.

The truth is, we all use APIs every single day, whether we realize it or not. The morning coffee that you grabbed on your way to the office? An API processed the payment (not the coffee, though). The map that you checked for traffic details on your vacation? An API delivered that map data. The Taylor Swift or Adele song that you listen to in your car? An API streamed the song to you. APIs are the invisible waiters of the tech world, taking our requests and bringing back what we asked for. They are the foundation of modern business and lifestyle convenience.

And here's the problem: the more important something becomes, the more valuable it is. Each one of those convenient APIs is a potential target for an attacker. As our reliance on APIs has exploded in the modern world, so has their appeal to bad actors. So the question is never whether your firm is going to get attacked. It's how well equipped you are, as a company, to deal with these attackers when it happens. Are you going to shut down your system, or shut out your legitimate users?

Which brings me to a story about a store owner, a crowded sale, and a very dumb door. Imagine you are a store owner. Your door is like an API. A few years ago, when you started, you had a simple rule: five people allowed into the store per hour. It was easy. But now your store is the heart of the town. Your door is not just a door anymore; it's a huge business, everyone wants to come in and shop, and probably some shoplifters want in as well.
The old rule of five people per hour is a nightmare, I tell you. Whenever there is a sale and a crowd shows up, your rule allows only five in, slamming the door on potential customers. Those frustrated customers walk away, and potential repeat business walks away with them. And if you leave the door open, shoplifters can sneak in and clean you out. You pick: block real users or let the thieves in.

This is the exact dilemma of traditional API protection. Organizations must choose between aggressive protection that could block legitimate users and lenient policies that leave systems vulnerable to abuse. This binary approach results in significant revenue losses, degraded user experiences, and compromised reliability. But AI and machine learning present an opportunity to fundamentally reimagine API protection through real-time analysis and adaptive mechanisms.

Moving on, let me tell you about my friend Sarah. She runs a travel app, and she lived this nightmare. Let's say, for ease of understanding, that this travel app also has a limit of five users per hour. In practice it would be more, but let's keep it at five. A famous influencer writes about the travel app on her blog, and suddenly there is a traffic spike. Real people, excited to book their next trip, come to the app, and the app slams the door. For an hour there is no revenue, and there are probably thousands of angry users, because everybody was excited right after the article. This is the static threshold problem: legitimate traffic from marketing campaigns doesn't follow predictable patterns.

The next day, a bot that wants to steal data comes to the app from 10, maybe 100, different IP addresses, each slowly probing the system, and each making sure it stays under the five-per-hour rule.
Her system sees nothing wrong. The bots get in and scrape all of the flight and hotel data. This is the sophisticated attacker problem: attackers distribute their requests across multiple IPs and vary their timing to stay below the detection thresholds.

Consider the cost. First, the false positive problem: when legitimate users are blocked, they may not retry their requests, leading to lost transactions and damaged customer relationships, which significantly impacts API-driven revenue streams. The other problem is that when bots mount a sophisticated attack, traditional rate limiting cannot identify that an attack is happening in the first place, let alone mitigate or respond to it.

Returning to our store owner: what if we gave him a genius security guard instead of a simple rule? The guard doesn't just count people; he watches how they behave. Is someone just slowly walking in and looking at products? Probably fine. Is a group of people suddenly shuffling in, all wearing similar clothes and beelining to the most expensive items? Probably a big red flag.

That's what AI does through behavioral feature analysis. Instead of just counting requests, it analyzes dozens of attributes simultaneously, such as request timing patterns, payload characteristics, geographic distribution, user agent diversity, and response code sequences. It also adds a temporal dimension: understanding request sequences, user session patterns, and how traffic evolves over time. It knows that an ideal user logs in, looks for something, adds it to the cart, and probably looks for something else. A bot does none of that; it isn't buying, so it doesn't put anything in the cart. This temporal awareness enables detection of slow-burn attacks and sophisticated attacks that might appear benign individually.

So how does it work? Think about it like this. You don't hire one security guard.
You hire a committee of many experts. One expert could be a geography whiz, one a timing guru, one all about browsers, and one all about IP addresses. Let's take just three of them into consideration. A request comes in and the committee votes. The geography expert says it's fine, but the timing guru and the browser expert say, "Nah, dude, that's weird." The majority rules: block it. This committee is what we call decision tree ensembles, like random forests and gradient boosting. They excel at handling mixed data types, are relatively interpretable, and capture complex non-linear relationships between features.

Of course, we face training challenges: legitimate traffic typically far outnumbers malicious requests. We address this imbalance through advanced sampling techniques and cost-sensitive learning. And here's the crucial part: this committee never stops training. It continuously learns from new attacks and changing traffic patterns through online learning algorithms and periodic retraining strategies, so it can anticipate what kinds of attacks may happen in the future. This continuous learning capability is what keeps the system smart over time.

Back to Sarah's travel app. The influencer writes about her app again, this time with AI in her system. The system sees that there is a traffic spike, but it also sees that the requests are coming from different users, different Instagram users. People are browsing different pages using various devices, and the AI thinks, "This looks like a happy crowd." The door stays wide open. Meanwhile, those sneaky bots come back. The AI sees that their requests are perfectly spaced, come from identical sources with identical software, hit only the login page, and do little apart from searching. The AI thinks this is suspicious, but instead of slamming the door, it might just slow them down.
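
The committee vote described in this part of the talk can be illustrated with a minimal sketch. It assumes three hand-written "experts" in place of trained trees; real ensembles such as random forests learn their rules from labeled traffic. The `Request` fields, the allowed-country set, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Request:
    country: str               # hypothetical: country derived from the client IP
    seconds_since_last: float  # hypothetical: gap since this client's previous request
    user_agent: str


# Each expert returns True when it considers the request suspicious.
def geo_expert(req: Request) -> bool:
    return req.country not in {"US", "IN", "GB"}  # assumed set of expected countries


def timing_expert(req: Request) -> bool:
    return req.seconds_since_last < 0.5  # assumed threshold for "too fast"


def browser_expert(req: Request) -> bool:
    agent = req.user_agent.strip().lower()
    return agent == "" or "bot" in agent


def committee_blocks(req: Request, experts: Sequence[Callable[[Request], bool]]) -> bool:
    """Block only when a strict majority of experts flags the request."""
    votes = sum(1 for expert in experts if expert(req))
    return votes > len(experts) / 2


EXPERTS = [geo_expert, timing_expert, browser_expert]
suspicious = Request(country="US", seconds_since_last=0.1, user_agent="python-bot/1.0")
shopper = Request(country="US", seconds_since_last=12.0, user_agent="Mozilla/5.0")
print(committee_blocks(suspicious, EXPERTS), committee_blocks(shopper, EXPERTS))
```

In the first request the geography expert sees nothing wrong, but the timing and browser experts both object, so the majority blocks it, which mirrors the example in the talk.
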
Slowing suspicious clients down instead of blocking them outright is progressive throttling: moving beyond binary decisions to implement increasingly restrictive measures based on confidence level, from subtle delays to rate reduction, and fully blocking only clearly malicious activity. The AI also uses dynamic user classification, automatically grouping users into behavioral cohorts based on historical patterns, application usage characteristics, and risk profiles. Users can move between these segments based on their real behavior, earning higher privileges through legitimate usage or facing restrictions for suspicious activity.

Okay, how do we build this? The beauty is that the cloud gives us the perfect tools. Coming back to the store, imagine you need to upgrade its security. We did speak about one smart security guard, but if you think about it, we cannot rely on a single guard, even one like the Hulk, because he wouldn't be able to multitask. You need a team: a camera to monitor things, a fast security guard who runs around, an expert consultant in the back who watches the camera feeds, and an intercom to connect them all so they can talk. And it need not be just one security guard; there can be several. The whole point is having a team. On AWS, we build this full stack with API Gateway, Lambda, and SageMaker. On Azure, it's API Management with Azure Machine Learning. On Google Cloud, it's API Gateway with Vertex AI.

Now, what if your store becomes so successful that it branches out into different countries and locations? You need your security to be consistent everywhere. The challenge is making sure that the security protocol in one branch doesn't drift from another. The solution: you don't let each branch make its own rules. You have one head of security at headquarters who sends the rules to every branch, and every branch executes them with its local team.
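
The progressive throttling idea from earlier in this part of the talk can be sketched as a mapping from a model's risk score to an escalating response. This is a minimal illustration, not the speaker's implementation: the score range of 0 to 1 and the three cut-off values are assumptions chosen only for the example.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    DELAY = "delay"              # add a subtle delay to responses
    REDUCE_RATE = "reduce_rate"  # lower the client's allowed request rate
    BLOCK = "block"              # reserved for clearly malicious activity


def choose_action(risk_score: float) -> Action:
    """Map a risk score in [0, 1] to a progressively stricter action.

    The cut-offs (0.3, 0.6, 0.9) are hypothetical and would need tuning
    against real traffic.
    """
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    if risk_score < 0.3:
        return Action.ALLOW
    if risk_score < 0.6:
        return Action.DELAY
    if risk_score < 0.9:
        return Action.REDUCE_RATE
    return Action.BLOCK


for score in (0.1, 0.5, 0.75, 0.95):
    print(score, choose_action(score).value)
```

The design point is that only the highest-confidence scores lead to a hard block, so a misjudged legitimate user is slowed down rather than shut out.
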
Multi-cloud strategies provide additional resilience but can introduce complexity in data synchronization and model consistency. Container orchestration platforms like Kubernetes can facilitate deployment consistency across different cloud providers.

Now, how does AI actually distinguish legitimate spikes from attacks in real time? Let's get into the real magic with three specific attacks on Sarah's travel app.

First, stopping credential stuffing. Attackers take millions of stolen passwords and try them on Sarah's login page. AI analyzes the characteristics of these traffic surges: the gradual onset of legitimate spikes often contrasts sharply with the sudden appearance of distributed attacks. It sees the pattern of rapid failures from different IP addresses and blocks the suspicious ranges. Credentials used in logins from such IPs are also forced through a password reset, keeping both the app data and the user data safe.

Second, preventing API scraping. A competitor uses bots to systematically steal Sarah's entire inventory. AI examines request payload patterns and sequences. Real users browse naturally, while scrapers execute perfect robotic patterns to find data and then do nothing with it; they don't add anything to the cart, for example. That is an identifier, and AI stops those kinds of attacks.

Third, catching low-and-slow account takeover. How does this happen? A hacker gets hold of a valid account and slowly gathers data over weeks. AI uses behavioral profiling and examines user session behaviors to spot anomalies. In this case, if the user is logging in just to view the personal data and payment details of the account, that's probably suspicious.

AI models examine request payload patterns, response code sequences, user session behaviors, and statistical correlations across sources.
These capabilities are essential for detecting distributed attacks such as credential stuffing, API scraping, and resource exhaustion attempts.

Let's talk about the best part: the payoff. Remember those bots that were sneaking into Sarah's app? Every one of them was costing her money. By stopping them at the door, companies see 30% lower infrastructure costs because they are not serving junk traffic anymore. IT teams get 40% of their time back because they don't have to look at bogus requests that never even get into the system. Plus, there is a 25% improvement in resource efficiency from right-sizing based on legitimate traffic patterns. The precision of AI-driven blocking reduces the need to over-provision infrastructure to handle potential attack traffic, resulting in substantial savings for high-traffic applications. This is a win for security, a win for your team, and a huge win for your budget and company. And needless to say, having these safeguards in place for users' data builds the firm's reputation, which boosts the company's valuation.

Now, this isn't a set-it-and-forget-it system. Remember the committee of experts: they never stop learning. They learn constantly through continuous adaptation, and a few things help them evolve over time. Data collection: traffic patterns, user behaviors, and attack indicators are continuously gathered. Drift detection: monitoring systems identify when user behavior evolves, to understand what is generic user behavior and what is an anomaly. Online learning: algorithms update models as new data arrives, to pick up new patterns in both attacks and legitimate user behavior. Feedback integration: security team investigations and user reports are incorporated. A/B testing: gradual rollout of updates validates improvements.
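
The drift detection step mentioned above can be sketched with a simple statistical check. This is one possible approach under stated assumptions, not the method used in the talk: it compares the mean of a recent window of a single traffic feature (for example, requests per session) against a baseline using a z-score, and the threshold of 3.0 is an arbitrary illustrative choice.

```python
from statistics import mean, stdev
from typing import Sequence


def drift_detected(baseline: Sequence[float],
                   recent: Sequence[float],
                   z_threshold: float = 3.0) -> bool:
    """Return True when the recent mean departs from the baseline mean.

    The z-score uses the baseline standard deviation scaled by the size of
    the recent window (standard error of the mean).
    """
    if len(baseline) < 2 or not recent:
        raise ValueError("need at least two baseline values and one recent value")
    base_mean = mean(baseline)
    base_std = stdev(baseline)
    if base_std == 0:
        # A perfectly flat baseline: any change in the mean counts as drift.
        return mean(recent) != base_mean
    standard_error = base_std / len(recent) ** 0.5
    z_score = abs(mean(recent) - base_mean) / standard_error
    return z_score > z_threshold


# Hypothetical values for one feature, e.g. requests per session.
baseline_window = [10, 12, 11, 9, 10, 11, 12, 10]
print(drift_detected(baseline_window, [10, 11, 12, 10]))  # similar behavior
print(drift_detected(baseline_window, [20, 22, 21, 19]))  # behavior has shifted
```

A drift signal like this would typically trigger a closer look or a retraining run rather than any blocking decision on its own.
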
Continuous adaptation ensures protection remains effective as new threats emerge and legitimate usage patterns shift over time. The system that protected Sarah's app last year evolves to handle newer attacks if they happen tomorrow.

How do we know it's working? We need both technical metrics, like detection accuracy, performance, resource utilization, and adaptation rates, and business impact metrics. Remember Sarah's story. The real success wasn't just blocking the attacks; it was also avoiding the revenue impact of blocking legitimate users, along with improved customer satisfaction, fewer support tickets, and reduced infrastructure costs. Effective evaluation necessitates a holistic view, combining security outcomes with tangible business impact. Relying solely on security metrics provides an incomplete picture of the system's true performance and value.

Now let's make this concrete with a real-world case study that could be Sarah's success story. A major e-commerce platform was experiencing credential stuffing attacks during peak shopping season. Traditional rate limiting blocked legitimate shoppers, resulting in lost sales estimated at 2 million annually. Because of this, they deployed AI-driven rate limiting with progressive throttling and user segmentation. The system analyzed 50-plus behavioral indicators to distinguish between legitimate shoppers and attackers. The results are staggering: the false positive rate was reduced by 92%, and the system successfully blocked sophisticated distributed attacks while handling 300% traffic spikes during flash sales. Infrastructure costs were reduced by 25% through the elimination of attack traffic. This isn't just theory; this is a proven strategy with dramatic results.

Looking ahead.
We see emerging technologies like deep learning for more accurate pattern recognition, reinforcement learning for adaptive response strategies, and automated feature engineering that reduces expertise requirements. Organizations that invest in building the necessary capabilities will gain strategic advantages: superior security posture, enhanced reliability and user experience, reduced operational overhead, and competitive differentiation through better API protection. As I said earlier, if you're protecting your data and your users' data, your reputation increases, and so does your company's valuation. And as the digital economy continues to expand, those with the most effective API protection strategies will be best positioned to capture the opportunities of an increasingly connected world.

So what's the story? It's about moving from a rigid, dumb rule that hurts your business to a smart, adaptive system that protects it. It's about using the cloud to build a team of AI services that work together as a genius security guard. It's about keeping your doors open for business for everyone who brings it, and firmly closed for those who don't. The future is intelligent, adaptive, and efficient, and it's available today. So let's make use of it in your respective systems and companies, wherever you are working. I would like to thank all of you for this opportunity and for sticking with me through this topic today. Thank you.

Conf42 Incident Management 2025 - Online

October 02 2025 - premiere 5PM GMT

Rehana Sultana Khan

Software Engineer @ Versa Networks
