Conf42 Platform Engineering 2025 - Online

- premiere 5PM GMT

Building Bulletproof Database Platforms: Engineering 99.98% Availability in Oracle Cloud


Abstract

Database failures cost $9,375/minute. I’ll show you the battle-tested framework that achieved 99.98% uptime across 750+ enterprises, including sub-3-second failovers, predictive monitoring that prevents 83% of outages, and cost cuts of 60%. Real engineering, real results.

Summary

Transcript

This transcript was autogenerated. To make changes, submit a PR.
This is Sunil. A little introduction about me: I'm a seasoned database administrator who has worked on different technologies at different multinationals in different geographies. Today I'm going to talk about machine learning-powered database resilience: predictive analytics for 99.98% Oracle Cloud availability.
The cost of downtime for any database is over $9,000 per minute, and across a whole year the financial impact is huge. With machine learning-based anomaly detection, you can detect issues early, take preventive action, and improve overall uptime.
Deep learning on performance metrics helps the model recognize the pattern first. Then, at five-second intervals, it keeps evaluating how current behavior compares with that pattern. Neural networks can also pick up on factors that are otherwise invisible across the whole system.
Then comes optimizing the Oracle cluster with reinforcement learning. That's how the entire cycle works: monitor, learn, optimize, validate, and keep repeating. Neural networks are also used for Data Guard tuning. This is again for high availability as well as scalability, using synchronized replication with minimal performance impact and fast recovery, within seconds.
ROI is the return on investment of machine learning resilience: a 287% ROI, an eight-month payback period, and 99.98% availability. That's huge, in my understanding. Then there are the predictive auto-scaling benefits because, as we said, with machine learning there is a model that learns and optimizes accordingly. The overall goal is to reduce cost without compromising the availability or the performance of the entire solution.
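To make the five-second pattern check from the talk concrete, here is a minimal sketch. The talk describes a deep learning model; this example swaps in a much simpler rolling z-score as a stand-in, and the window size, threshold, function name, and latency readings are all hypothetical values chosen for illustration.

```python
from collections import deque
from statistics import mean, stdev

SAMPLE_INTERVAL_S = 5   # the talk mentions five-second evaluation intervals
WINDOW = 12             # hypothetical: one minute of history (12 samples)
THRESHOLD = 3.0         # hypothetical: flag samples more than 3 sigma from the mean


def detect_anomalies(samples, window=WINDOW, threshold=THRESHOLD):
    """Return indices of samples that deviate strongly from the recent window.

    A rolling z-score stands in for the learned model described in the talk.
    """
    history = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(samples):
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                anomalies.append(i)
                continue  # keep the outlier out of the baseline
        history.append(value)
    return anomalies


# Synthetic latency readings in ms (made up for illustration), one per interval.
latencies = [10, 11, 10, 12, 11, 10, 11, 12, 10, 11, 12, 11, 48, 11, 10]
for idx in detect_anomalies(latencies):
    print(f"t={idx * SAMPLE_INTERVAL_S}s latency={latencies[idx]}ms flagged")
```

Only the spike at the thirteenth sample is flagged. Excluding flagged samples from the baseline is a design choice here: it keeps one outlier from inflating the window's spread and hiding the next one.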
There are case studies where 99.98% availability was achieved in financial services, healthcare, retail, and basically all lines of business across the different modules. The scalable AI framework can be scaled to any level, horizontally or vertically, with cross-architecture compatibility. Automated anomaly detection also adds to engineer productivity overall by using machine learning. So, transforming database resilience: it's discover, implement, optimize, and achieve the goal, which is close to a hundred percent availability, though there is a small fraction, less than a percent, of unforeseen issues that can occur. Thanks a lot for your time, and I hope you now have some idea about the topic we wanted to discuss today. Thank you again.
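As a rough illustration of what a 99.98% availability target leaves as a downtime budget, the arithmetic can be sketched as follows. The per-minute cost is the $9,375 figure from the abstract; the function names and the 365-day year are assumptions made for this example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def downtime_minutes_per_year(availability_pct):
    """Minutes per year a service may be down at the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


def downtime_cost_per_year(availability_pct, cost_per_minute):
    """Yearly cost if the whole downtime budget is used up."""
    return downtime_minutes_per_year(availability_pct) * cost_per_minute


budget = downtime_minutes_per_year(99.98)
cost = downtime_cost_per_year(99.98, 9375)  # $9,375/minute, from the abstract
print(f"{budget:.1f} minutes/year, about ${cost:,.0f}")
```

At 99.98% the budget works out to roughly 105 minutes of downtime per year, which at the abstract's per-minute cost is a little under one million dollars.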

Sunil Yadav

@ University of Pune


