Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello, everyone.
My name is Mithun Panda, and I specialize in technology, data, and AI, helping
Fortune 500 companies solve complex business and technical challenges.
Today I'll be talking about how to build scalable AI and data solutions
using cloud native architecture.
Now, imagine this: AI systems that scale effortlessly, large data pipelines that adapt in real time, and businesses that continuously innovate faster than ever, without worrying about infrastructure management overhead.
That's the true power of cloud native technology.
So let's dive in and explore how it is transforming the way we build AI.
Now, when you look at the trends in the industry, we are always obsessed with value creation.
So when we think of digital, technology, or AI transformations, we always focus on business value creation.
Cloud adoption has definitely helped organizations accelerate on the business value side.
However, to make this happen, we must get a few things right; the foundation has to be correct.
For example, there are six key elements that I have mentioned here, starting with building a foundation for analytics: a foundational platform that will enable your use cases and help you accelerate them.
The second element is cloud native architecture, which is very important.
We are moving away from on-premise to cloud native, which helps us scale up or down based on demand and achieve fault tolerance, with high availability and self-healing mechanisms.
The third is rapid value creation.
To achieve rapid value creation, we have to understand the cost side of it as well; this is the financial operations (FinOps) side, making sure we can process large data and AI workloads while staying cost efficient.
And then there is scaling resources seamlessly.
That is also very important, because we will be deploying a lot of AI use cases and building a lot of AI models, and it is really necessary that we can scale resources up seamlessly.
Because we are on a data-driven journey, making sure the data is governed properly, accessible, reliable, and available is really important.
These are the key data enablers, and we must get them right.
And then finally, a future-proof digital advantage.
We are moving away from on-premise to cloud, and now from cloud to multi-cloud.
Again, I'm not saying that on-premise will die here, but we will definitely live in a world where our architectures have to support a very diverse kind of infrastructure: multi-cloud, SaaS-based applications, and on-premise applications.
Going forward, it is foundational to understand what cloud native means, which is to design, build, and run applications specifically optimized for cloud environments.
So instead of traditional on-premise infrastructure, cloud native applications leverage the true power of cloud capabilities such as scalability, resilience, automation, flexibility, and so on.
When you look at scalability, what it means is that the cloud has the ability to adjust resources based on demand.
For example, Netflix scales its infrastructure up during peak hours and scales it down during off-peak hours.
It's very intelligent, and this is how they scale their recommendation systems and their streaming systems as well.
Resilience is very simple: if some failure happens, make sure your system is resilient enough that it can quickly recover from failures.
For automation, we really cannot separate automation from DevOps or CI/CD, and part of it is infrastructure management, which is infrastructure as code, which is very important to automate deployments.
And finally flexibility, which is super important here, because we need to provide interoperability across various cloud providers, as we are going on the multi-cloud journey.
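The talk doesn't show infrastructure as code in action, but a minimal sketch of the idea, using Pulumi's Python SDK as one possible tool (the bucket name is a placeholder), could look like this:

```python
# Minimal infrastructure-as-code sketch using Pulumi's Python SDK, one of several
# IaC options; this would be the __main__.py of a Pulumi project.
import pulumi
import pulumi_aws as aws

# Declaring the resource in code lets CI/CD create, update, or destroy it automatically.
data_lake = aws.s3.Bucket("ai-data-lake")  # placeholder bucket name

pulumi.export("bucket_name", data_lake.id)
```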
Now, if I sum it up on the cloud native architecture side, it leverages microservices, containers, serverless computing, DevOps CI/CD, and orchestration tools to help us achieve cost efficiency, innovate faster, and optimize AI performance.
Now, just to take a thousand-foot view of each of these elements I just talked about: what exactly do microservices mean?
Microservices are nothing but applications broken down into smaller, independent, and modular services that communicate via APIs.
This is very straightforward.
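To make that concrete, here is a minimal sketch of a single microservice exposing its capability through an API, assuming FastAPI; the endpoint and the placeholder recommendations are illustrative, not something from the talk:

```python
# Minimal sketch of one microservice exposing its capability via an API.
# Assumes FastAPI and uvicorn are installed; "recommend" is a hypothetical endpoint.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="recommendation-service")

class RecommendRequest(BaseModel):
    user_id: str
    top_k: int = 5

@app.post("/recommend")
def recommend(req: RecommendRequest) -> dict:
    # In a real service this would call a model or a feature store;
    # here we return placeholder items to keep the sketch self-contained.
    items = [f"item-{i}" for i in range(req.top_k)]
    return {"user_id": req.user_id, "recommendations": items}

# Run locally with: uvicorn service:app --port 8080
```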
Then we have containers.
What are containers?
A container is nothing but a lightweight, portable environment for running applications seamlessly across different cloud setups.
Then there is container orchestration, such as Kubernetes, which is very popular, to manage, scale, deploy, and automate containerized applications.
Next is serverless computing.
This is where we do not need to manage the cloud infrastructure: AWS Lambda, Google Cloud Functions in GCP, and Azure Functions in Azure are examples of serverless computing.
It really helps us avoid scratching our heads over managing the cloud infrastructure side of it.
And then DevOps and CI/CD; I already mentioned the CI/CD side of it, which is the DevOps side: continuous integration and continuous deployment pipelines for faster deployment and innovation.
Now, we are building a lot of generative AI models as well, so CI/CD has been extended to CI/CD/CE.
What that means is we are also continuously evaluating the models; that's why it is CI/CD/CE: continuous integration, continuous deployment, and continuous evaluation.
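One way to picture that continuous-evaluation step is a small gate script in the pipeline; this is only a sketch, with an assumed accuracy threshold and hypothetical load_candidate_model / load_eval_data helpers:

```python
# Sketch of a CE gate: evaluate the candidate model and fail the CI job if it regresses.
import sys
from sklearn.metrics import accuracy_score

ACCURACY_THRESHOLD = 0.90  # assumed quality bar; tune per use case

def main() -> int:
    model = load_candidate_model()      # hypothetical: pulls the newly trained model
    X_eval, y_eval = load_eval_data()   # hypothetical: pulls a frozen evaluation set
    accuracy = accuracy_score(y_eval, model.predict(X_eval))
    print(f"candidate accuracy: {accuracy:.3f}")
    # A non-zero exit code makes the CI/CD pipeline stop before deployment.
    return 0 if accuracy >= ACCURACY_THRESHOLD else 1

if __name__ == "__main__":
    sys.exit(main())
```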
Moving on.
Again, this is a deep dive on each of these components that I mentioned, looking at how microservice and container principles help us enable scalability and resilience.
There are three quick things I would like to point out here for data and AI solutions.
One is independent scaling, so scaling AI inference separately from data ingestion.
When you build AI models, you basically have to separate the front end, the back end, the data ingestion, the feature engineering, the model development and evaluation, and the inference pieces, and this is where containerization is going to help you.
Second is fault isolation: if one service goes down, making sure another stays operational, which helps us build the resilience side of it.
And then faster deployments: making sure your architecture is modular, which is really going to help you deploy faster.
Now, on the right-hand side, I've just given one example here; there are tons of examples in the industry, and in your organizations you might already be doing this: leveraging microservices and Kubernetes-based architectures.
This really helps us scale when building our AI applications or data-driven, data-centric applications.
For example, Spotify uses microservices and Kubernetes to scale its AI-powered music recommendation engine.
Netflix is another great example, where a deployment happens every 10 or 11 seconds, I would say, and they leverage a modular, microservices- and Kubernetes-based architecture.
Amazon is another one, an early adopter.
I really don't need to say much more about containerization, but looking at the AI-specific angle, it helps us manage consistency across environments.
And then Kubernetes, the container orchestration platform, helps us with scaling, orchestration, and self-healing of AI workloads.
Moving on.
Serverless computing.
Serverless computing is another really important piece when we manage our data and AI workloads.
It helps us execute code without managing infrastructure.
Now, in generative AI solutions, when we build generative AI models or large language model applications, GPU-as-a-service and inference are two really critical things.
This is where we make sure our infrastructure has the capacity and is efficient enough that we can build the model, deploy the model, and achieve low latency at inference.
This is where serverless GPUs are very important.
For inference as a service, you can definitely leverage GPUs, but you also have LPUs, such as Groq's, which provide faster inference as a service.
Now, what are the benefits we get out of leveraging serverless computing?
One is auto-scaling: it dynamically allocates resources based on usage.
Then cost efficiency: you pay only for execution time.
And faster deployment, because it eliminates the overhead of managing the infrastructure.
There are so many benefits, but for a developer, and for us as senior executives or decision makers, these are the tangible benefits we immediately see once we start leveraging serverless computing.
What are the common use cases?
There are definitely tons of use cases, but in the current scenario, AI-powered chatbots are one, along with AI model inference and real-time data processing using serverless ETL pipelines (extract, transform, and load, or extract, load, and transform, whatever you call it).
These are the kinds of use cases that help us manage data and AI workflows with on-demand resources.
So, moving on: when we look at making data and AI solutions scalable, we cannot ignore the storage and data management side of it, as well as MLOps, which is the extension of DevOps.
When you look at the storage setup, there is definitely the data lake, which we leverage as S3 in the AWS world, Azure Data Lake Storage Gen2, or Delta Lake, and the data warehouse, such as BigQuery, Snowflake, or Synapse; there are a lot of options.
And in generative AI, we cannot leave out vector databases such as Weaviate or Pinecone; there are so many, and they help us store the embeddings for generative AI applications.
For data processing tools, again there are so many; Spark has been very popular, there is Dataflow in Google Cloud (GCP), and Databricks is one of the best in the market in terms of adoption.
Now, what should our scalability strategy be?
Definitely, we can use the right databases for real-time AI workloads, along with tiered storage, and lifecycle policies must be there.
And again, it should all be surrounded by data governance principles and data management best practices.
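As one concrete reading of the tiered-storage and lifecycle-policy point, here is a sketch using AWS's boto3 SDK; the bucket name, prefix, and day counts are assumed placeholders:

```python
# Sketch: tiered storage via an S3 lifecycle policy.
# Assumes boto3 with AWS credentials; "my-data-lake" is a placeholder bucket.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="my-data-lake",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-raw-data",
                "Filter": {"Prefix": "raw/"},
                "Status": "Enabled",
                # Move colder data to cheaper storage classes over time.
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                # Expire raw objects after a year; curated copies live elsewhere.
                "Expiration": {"Days": 365},
            }
        ]
    },
)
```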
Now, on the right-hand side, if you look at MLOps to automate and scale AI workflows: if you really want to innovate faster and achieve faster time to market, MLOps is very important.
It basically integrates DevOps best practices into AI model development, deployment, and monitoring.
What are some key components?
When you look at the machine learning pipeline, it starts with feature engineering, storing features in feature stores, building the model, creating model versions, and then deploying, monitoring, and running bias detection, to make sure it prevents drift.
One example: if you look at the investment banking or retail banking side (I have a lot of experience on the banking side), using MLOps to continuously update fraud detection models is one example.
Risk management is another example, and hyper-personalization is another, and this is where MLOps has been really beneficial in managing large-scale AI workloads and achieving automation and scale.
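Here is a minimal sketch of one slice of that pipeline, training, logging metrics, and versioning a fraud-detection-style model, assuming MLflow and scikit-learn; the experiment name, model name, and numbers are illustrative:

```python
# Sketch of an MLOps training step: train, log metrics, and version the model.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

mlflow.set_experiment("fraud-detection")  # illustrative experiment name
with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=200, random_state=42)
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

    mlflow.log_param("n_estimators", 200)
    mlflow.log_metric("auc", auc)  # tracked over time to help spot drift
    mlflow.sklearn.log_model(model, "model",
                             registered_model_name="fraud-detector")  # versioning
```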
Moving on: how should we think about security and compliance?
We really cannot separate out security and compliance, and considering the generative AI adoption happening in the industry, security and compliance remain really critical.
I just want to keep it very high level here.
There are four parts when you think of security and compliance.
One is data privacy.
The second is encryption.
The third is ZTA, zero trust architecture, which is super important.
And then API vulnerabilities.
In terms of data privacy, we have to follow certain techniques, such as implementing differential privacy to prevent data leaks, along with data masking, anonymization, or pseudonymization, making sure that PII data is not exposed, and making sure you are compliant with GDPR, HIPAA for healthcare, or the CCPA, the California Consumer Privacy Act.
There are so many other regulations, and they differ based on the jurisdiction or country you are in.
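Two of those techniques can be sketched in a few lines; the salt, epsilon, and values below are placeholders, and a real deployment would use vetted privacy libraries rather than this toy code:

```python
# Sketch: pseudonymize a PII identifier and release a differentially private count.
import hashlib
import numpy as np

SALT = "rotate-me-regularly"  # placeholder; manage real salts/keys in a vault

def pseudonymize(customer_id: str) -> str:
    # One-way hash so the raw identifier never leaves the secure zone.
    return hashlib.sha256((SALT + customer_id).encode()).hexdigest()[:16]

def dp_count(true_count: int, epsilon: float = 1.0) -> float:
    # Laplace mechanism: a count has sensitivity 1, so noise scale is 1 / epsilon.
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

print(pseudonymize("cust-00123"))
print(dp_count(true_count=4821, epsilon=0.5))
```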
When you look at encryption, it definitely means implementing hardware security modules (HSMs) for key management.
For example, if you are looking at certain tools or services that you want to leverage to build your generative AI solutions, make sure they are HSM compliant.
That is very important; otherwise, your security office or the compliance team might not allow you to use that service or tool.
Zero trust architecture.
Really, this is very simple: make sure you enforce least-privilege access; role-based access control is super important here, along with continuous authentication with behavioral analytics and some sort of secure enclave computing.
Besides this, you also have MFA, and that is also very important.
Then, for API vulnerabilities: how do we secure the AI models?
Endpoint security, which means using OAuth 2.0, JSON Web Tokens (JWT), or mutual TLS/SSL.
We have the application firewall; again, this is very important, and it helps us secure API gateways to protect AI services and their endpoints.
It is also really important to regularly conduct penetration testing, and your architecture must be privacy- and security-first.
This is really very important when we are on the journey of building AI solutions on a cloud native architecture.
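To illustrate the endpoint-security piece, here is a small sketch that checks a JWT bearer token before serving an AI request, assuming the PyJWT library; the secret, issuer, and scope names are placeholders:

```python
# Sketch: verify a JWT bearer token before allowing access to an AI endpoint.
import jwt  # PyJWT

SECRET = "use-a-key-from-your-kms-or-hsm"   # placeholder; fetch from a key vault in practice
ISSUER = "https://auth.example.com"          # placeholder issuer

def authorize(token: str, required_scope: str = "model:infer") -> bool:
    try:
        claims = jwt.decode(token, SECRET, algorithms=["HS256"], issuer=ISSUER)
    except jwt.InvalidTokenError:
        return False  # expired, tampered with, or issued by someone else
    # Least privilege: the caller must hold the specific inference scope.
    return required_scope in claims.get("scope", "").split()

# Usage: reject the request unless
# authorize(headers["Authorization"].removeprefix("Bearer ")) returns True.
```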
Now, how can we optimize performance, making sure we get the right speed, the right efficiency, and the right latency in terms of inference?
I've mentioned four things here.
One is the infrastructure: you must ensure your infrastructure is sized properly, and design it to auto-scale AI workloads based on demand.
Next is model optimization: what are some techniques we can use to improve model training and inference speed?
Quantization is one approach, reducing the model's memory footprint while maintaining accuracy.
Then data locality: can the data be stored and processed in the same cloud region?
And finally, what are some frameworks we can adopt for AI acceleration, not only in terms of development but also in terms of the inference setup?
For example, NVIDIA Triton or Hugging Face Optimum for LLM performance tuning, or Groq, which uses an LPU (language processing unit) for really fast inference.
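As a quick illustration of the quantization point, here is a sketch using PyTorch dynamic quantization on a toy model; the layer sizes are arbitrary, and this is just one of several quantization approaches:

```python
# Sketch: dynamic quantization of linear layers to shrink an example model
# and speed up CPU inference; layer sizes here are arbitrary.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Convert Linear weights to int8; activations stay in floating point.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
print(quantized(x).shape)  # same interface, smaller weights
```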
Now, I have put together one architecture, a very naive RAG-based architecture, showing how we can implement RAG leveraging cloud native capabilities.
Let me spend 20 or 30 seconds on what RAG actually means.
Look, you have a lot of data in your organization, and if you take any existing LLM or pre-trained model, your data is not in it; it is trained up to a certain point in time.
So how do we ensure accuracy and make sure we get the expected results, based on correct, real-time information?
This is where RAG is very popular, because we really cannot fine-tune all the time.
So you have the unstructured reference data, as you see: a lot of PDF, Word, or text files.
You chunk it and then convert it into embeddings, leveraging embedding APIs.
There are tons of embedding APIs, such as OpenAI embeddings or Hugging Face embeddings; you can just take a look at the leaderboard, pick one, and move on.
Then you load the embeddings into a vector database such as Pinecone or Weaviate, or you can also leverage Redis in Azure; there is another tool as well that I forget, but there are a lot of vector databases, so you really don't need to worry about that.
So this is step one.
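Here is a compressed sketch of that step one, chunk, embed, and load, assuming the OpenAI embeddings API; the in-memory list is a stand-in for a real vector database such as Pinecone or Weaviate, and the document contents are placeholders:

```python
# Step one sketch: chunk documents, embed the chunks, store the vectors.
# Assumes the openai package (>= 1.0) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def chunk(text: str, size: int = 500) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]

documents = ["...contents of your PDF, Word, and text files..."]  # placeholder
vector_store = []  # in-memory stand-in for the vector database

for doc in documents:
    pieces = chunk(doc)
    resp = client.embeddings.create(model="text-embedding-3-small", input=pieces)
    for piece, item in zip(pieces, resp.data):
        vector_store.append({"text": piece, "embedding": item.embedding})
```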
Step two is the user interface side of it, and this is where multiple users will be sending their queries.
Make sure you have a load balancer, such as an API gateway or an Nginx load balancer.
The query comes to the APIs and basically runs a search against the vector database; the embedding API must be the same embedding API that was used earlier.
Then it retrieves the top k, so the top three or top five based on your configuration, and the query and the retrieved documents are passed to the LLM to get your answer.
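And here is a matching sketch of step two, using the same embedding model for the query, retrieving the top k chunks, and passing them to the LLM; the model names are illustrative, and the cosine-similarity loop stands in for the vector database search (it reuses client and vector_store from the step-one sketch):

```python
# Step two sketch: same embedding model for the query, top-k retrieval, then the LLM.
import numpy as np

def answer(question: str, top_k: int = 3) -> str:
    q = client.embeddings.create(model="text-embedding-3-small",
                                 input=[question]).data[0].embedding

    def score(entry):
        # Cosine similarity between a stored chunk and the query embedding.
        v = np.array(entry["embedding"])
        return float(np.dot(v, q) / (np.linalg.norm(v) * np.linalg.norm(q)))

    top = sorted(vector_store, key=score, reverse=True)[:top_k]
    context = "\n".join(entry["text"] for entry in top)

    chat = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return chat.choices[0].message.content
```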
Now, how does this translate to cloud native capabilities?
Again, this is a basic, naive RAG, not an advanced RAG.
When you look at the load balancer side, it ensures even distribution of requests, which is very important; we can also use a load balancer on the LLM side as well, which I have not shown in this architecture, but a load balancer is really critical; that's number one.
Next, each of the components, the front end, the back end, the vector database, the prompt template, and the LLM services, is Docker containerized, and this is really important because we don't want to create a monolithic system; each component is containerized.
We can also leverage container orchestration like Kubernetes to manage each of these containers I just mentioned, with horizontal pod autoscaler (HPA) capabilities to ensure services scale up and down based on demand, which is really important.
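Here is a rough sketch of that HPA idea using the official Kubernetes Python client; the deployment name, replica range, and CPU target are placeholders, and in practice this is usually declared as a YAML manifest instead:

```python
# Sketch: create an HPA so a (hypothetical) "llm-service" deployment scales
# between 2 and 10 pods based on CPU load. Assumes the kubernetes client package.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() when running inside the cluster

hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="llm-service-hpa"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="llm-service"
        ),
        min_replicas=2,
        max_replicas=10,
        target_cpu_utilization_percentage=70,
    ),
)
client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="default", body=hpa
)
```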
Now, in this architecture, make sure that the security and privacy best practices I talked about earlier are applied, such as Istio for the service mesh, OAuth for API authentication, RBAC, data encryption, and so on.
When you look at the vector database, it runs as a stateful Kubernetes service with persistent storage.
LLM hosting and LLM inference are very important, based on GPU nodes or optimized inference services such as Azure ML or AWS SageMaker, among many others.
And then you can leverage Grafana for API latency and the ELK Stack as well.
Now, where are we heading?
What are some future trends?
These are the four things I would mention.
One is that we are moving toward AI-powered cloud automation through self-optimizing cloud architectures; this is really happening now.
The second is making sure we can process AI closer to the data source, which is really important.
LLMOps is another really important piece, as we are trying to optimize our generative AI and LLM-based workloads.
And finally, quantum computing in AI and the emerging potential of quantum-enhanced AI workloads; that is also another future trend that I see.
Look, this is a very small, quick 15-minute presentation, but working with so many Fortune 500 organizations, what we have seen is that adoption is really happening.
The leaders who have already established a certain foundation in their infrastructure are really accelerating their business value.
But the firms and organizations that have started to realize they have to leverage cloud native approaches to build scalable data and AI applications are also definitely investing in their capabilities, to make sure they stay ahead of the curve and remain in a competitive advantage position.
That's it.
Thank you very much.