Conf42 Cloud Native 2022 - Online

It's all about the Data


Abstract

Data Management is required across the board when it comes to any platform: we could be talking about Virtualisation, Cloud (IaaS, PaaS, SaaS, etc.), Cloud-Native, and Legacy, and sometimes all of these platforms are linked together to serve the end user. Data Management consists of many different facets, including Backup, Recovery, and Migration, and also leveraging that data as part of another use case that does not interfere with the production environment.

In this session we are going to focus on protecting stateful workloads in your cloud-native Kubernetes environment: the importance of making sure your data services are protected, but also the capabilities available to enable easy migration between multiple different Kubernetes clusters and environments. Database not running in Kubernetes? That is fine: we also have a unique way of protecting data services that reside outside of the Kubernetes cluster.

If we have time, we can also touch on the ability to add this to your continuous deployment process to ensure that your data service is also protected before any code change.

Summary

  • Michael Cade is a senior global technologist at Kasten by Veeam. He focuses on data management around Kubernetes. He wants to debunk some of those myths as we go through some of this. Any questions, please let me know either in the chat function or find me on social media.
  • Data is the new lifeblood of everything that we're doing. Regardless of where your workload resides, data is probably the most important thing. None of these platforms have gone away. We still have the requirement for physical systems, for virtualization, for cloud and containers.
  • Kubernetes is focusing on that storing and protecting your data via backup and restore. Another key area is disaster recovery. How can we move that data wherever we want?
  • The first demo uses a three node cluster with a web application and a database. The next demo will talk about data outside of the Kubernetes cluster that still needs to be protected. In my opinion, you're going to want to be able to capture that whole application.
  • Kubernetes leverages persistent volumes and persistent volume claims. The import policy can bring Pacman, that whole application, in a consistent fashion over into my EKS cluster with all of the transformation needed to get it over there, and can also transform what that looks like.
  • Using config maps and secrets, we can marry up the Kubernetes cluster with RDS. This gives our Node.js application access to that database. We can then use that data to recover from, like we saw in the previous demo. There's freedom of choice when it comes to where you want to run your workloads.
  • I think another misconception is that stateful workloads within Kubernetes are the only ones that need to be protected. Many of the applications that you might consider to be stateless workloads still have some sort of data that you would like to retain. And finally, how to get hands on with Kasten K10.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hey everyone, I'm Michael Cade. I'm a senior global technologist at Kasten by Veeam, where we focus on data management around Kubernetes. But in particular, my background comes completely from data management, whether it be physical, virtual or cloud-based workloads, and now even more recently the requirement around having to protect workloads in a Kubernetes environment. So I just want to debunk some of those myths as we go through some of this. But then I've also got some demos, pretty cool demos as well, to actually debunk those myths. Any questions, please let me know either in the chat function or find me on social media. Email address is also down below. So I think we've all seen a slide that resembles a lot of this information around data, whether it be data on social media, whether it be machine learning from autonomous cars, whether it be photos, digital footprints from a personal and business point of view. But ultimately, we know that data is growing, and really it doesn't matter where the platform is or what we're seeing out there around that. And I think it's also important to remember that data is the new lifeblood of everything that we're doing. It's the common denominator. So regardless of where your workload resides, whether it's a web page or whether it's a complete system, databases, et cetera, that build up your whole system that your company uses or your customers use, then the data is probably the most important thing to you at that point, whether it be a virtual machine, a cloud-based solution, SaaS, PaaS, et cetera. But data is the common denominator, and we're seeing that across the board when it comes to failure scenarios, most dominated by things like ransomware in the ecosystem at the moment; we're seeing a lot of that.
And the important thing to note is that in all of these options that we have, whether it be physical machines, whether it be virtual machines, whether it be virtual machines in the cloud, or PaaS and SaaS based solutions, but then even more so, containers and container orchestration engines, the data still has to live somewhere, whether that be inside these platforms. So a virtual machine with SQL Server or MySQL or NoSQL on it, or whether it's a cloud-based workload that's running RDS within AWS, or whether it's a container running a stateful set which has your data service residing there within the same cluster as maybe your front end website or front end application, or whether it's taking advantage of data outside of a cluster, maybe leveraging that RDS approach as well. But either way, we need to think about the data protection and management of that and make the correct choices, because none of those platforms are going to stop the failure scenarios, the accidental deletions, the malicious activity, both internal and external, the ransomware attacks, the security breaches; it's not going to protect against that. Yes, we have high availability, we have fault tolerance across many of these different platforms if configured correctly. But one thing that is for sure is that we don't necessarily have backup built into those platforms as well. And that's where we want to highlight that and raise awareness of what that is and what it looks like. Another important thing that I've been saying for a long time: if you went onto your favorite search engine and you searched for containers and VMs, containers versus VMs is what you're going to find. And actually that shouldn't be the message that's portrayed out in our industry, because none of these systems have gone away. None of these platforms have gone away. We still have the requirement for physical systems, for virtualization, for cloud and containers.
It's about having an awareness of what each one brings so that we can make the right decision for our application and for our data within our businesses. None of them are going away. Yes, we saw a massive consolidation of physical systems into virtual, but ultimately there is still the requirement around physical systems. But what that then gives us is freedom of choice as to where we can store things. And obviously there's a lot of technology built on top of these platforms or the areas that I just touched on, but we have to make that decision. And if you only know about virtual and physical and maybe a bit of cloud, as a systems administrator, DevOps engineer, platform engineer, et cetera, you're going to tend to go with what you know, whereas what we're trying to do is raise awareness of these other platforms out there in other sessions, to let you know that actually you might be better off using Kubernetes or containerized workloads, or the cloud, RDS and things like that from AWS. So one of the key aspects that I want to get across today, because it's the same but different, is that we're going to focus in on Kubernetes and data management, and in particular on storing and protecting your data via backup and restore. So we've been doing backup and restore for a long time, obviously way back from a physical systems point of view with an agent; virtualization came along and we started hooking into the native APIs; cloud, exactly the same. And now here we are with Kubernetes. So it's about choosing the right tool for the right job, but also being able to leverage some of the platform's underpinning APIs to take a more efficient and fast approach to protecting that data, as well as being able to restore. No one really cares about the backup, you only really care about the restore, but you have to back up to be able to restore. Another key area is disaster recovery.
I mentioned that we've got high availability, we've got fault tolerance built into these platforms. However, disaster recovery is not built into them. So we have to think about what happens in the failure scenario of fire, flood and blood; what happens if we need to bring up that data somewhere else, that mobility of data. And then that leads me on to the other use cases that get highlighted from a data management point of view. From an application mobility point of view, how can we move that data wherever we want? And whether that is to reduce risk, whether it is to increase efficiency or whether it is to reduce cost, one or more of those three things is going to impact the business in a positive way, or a negative way if you are not able to do it. So one of the things that we've been massively promoting from a Kasten K10 point of view is the ability to move data from A to B and back again if need be. But also think about things like being able to clone a copy of that data and put it to work. So let's get straight into the first demo. Now, what I've got here is a very simple environment, and it actually leads on from another demo that I did in another session that built out this environment. But for the purposes of this demo, we have a three node cluster: we have a control plane, we have node one and we have node two. We have a service within our Kubernetes cluster, we have a web application which is written in Node.js, and we have a database that is using MongoDB. And within that we have a persistent volume claim that is using the CSI driver, the hostpath driver here in this instance. But what this demo is really highlighting is that everything is put into the same platform, into this cluster, because the next demo is going to talk about data outside of the Kubernetes cluster that still needs to be protected. But how do we concentrate on the whole application?
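The demo environment just described can be sketched out as a manifest. This is a minimal, illustrative sketch only: the namespace, claim name, and storage class name are assumptions, chosen to match the CSI hostpath driver mentioned above.

```yaml
# Illustrative PVC for the MongoDB back end of the Pacman demo.
# The namespace, name, and storage class are assumed for this sketch;
# "csi-hostpath-sc" stands in for whatever class the hostpath CSI driver
# exposes in your cluster.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongo-storage
  namespace: pacman
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: csi-hostpath-sc
  resources:
    requests:
      storage: 1Gi
```

The point of the demo is that this claim is only one object among many (deployment, service, ingress, secrets) that together make up the application.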
Because it's all well and good being able to take a copy of that persistent volume claim, which some other software can do, and that might be enough for your requirements when it comes to that failure scenario. But in my opinion, you're going to want to be able to capture that whole application. You're going to want to be able to restore the service, the web app, the database, the ingress that goes with that, the persistent volume claim that goes with that, all of the external data that lives in that Mongo database as well. So let's get into that. And I'm obviously using my mission critical application, Pacman. For those that want to see this, it's open source; it's out there as a Helm chart now, as well as deployment YAMLs. So, mission critical application: the front end is written in Node.js, and the back end database where all my mission critical high scores are living is on a persistent volume. So let me try and... okay, so it's kicking off. So what we're going to do is we're going to play a quick game. Let's rack up one of those high scores. And if you happen to be watching this and you go to app.vzilla.co.uk, depending on what other demo I'm doing at the time, because I tend to use that DNS name quite a bit, you might find that you can have a play. As you can see here, I have a lot of very important high scores across different Kubernetes clusters as well. You can see that we pick up some of that important information as we go. Now, I didn't log a score that hit the high leaderboard. So if I now go into, and this is talking about application mobility for the rest of the demo. So here we have our Pacman namespace within our Kubernetes environment. But now if I switch to a secondary, maybe a disaster recovery or maybe just a secondary system, in particular I'm going to go to EKS, AWS EKS. And now if I go get namespaces, you can see I don't have Pacman here.
I have Kasten running, I have Kasten running in both, but I don't have the Pacman namespace. And what I want is to be able to bring that data, that important data, over to my EKS cluster. Now that could be a migration, that could be disaster recovery, and it could also be a clone. Like, there might be a service within EKS or AWS that I want to take advantage of; that data could really do with some of the services that are native there versus it being in GKE. Now, another thing that we can do with that restore policy is transform what that looks like. Because in the primary cluster we might be using storage type A and we might be using a load balancer, but when we go to EKS I want to change that, because I want it to come up on a different storage tier and so on and so forth. So I'm now running through this restore policy that can be scheduled, and we can automate that. You saw the frequency on there, and then we've got this import policy that we're now going to run within that. So if I jump into my EKS cluster, and that was a snippet just before where you can see all of my clusters, that's a snippet of K10 and its multi-cluster capabilities. So now I'm in my EKS cluster and I'm running that import policy to bring Pacman, that whole application, in a consistent fashion over into my EKS cluster with all of the transformation that I need to get it over there. So if I now go and look at that namespace, which by magic is now being created, you can also now start to see the restore configuration; you see a deployment for both Pacman and Mongo. And now if I go and check the services that we have within there, this won't be, but it could have been: if this was a migration, I could decide to change that DNS entry for app.vzilla.co.uk, which is going to another forward facing IP address on the Internet or a DNS IP on the Internet. Now we're going to go to an AWS session.
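The transformation being described follows a JSON-patch style. The fragment below is a rough sketch of that idea, not the exact policy from the demo: the resource subjects, transform names, and values are assumptions, and field names may differ between K10 versions.

```yaml
# Sketch of restore-time transforms (JSON-patch style), illustrative only:
# on import into EKS, swap the storage class and adjust the service type.
transforms:
  - subject:
      resource: persistentvolumeclaims
    name: changeStorageClass          # assumed transform name
    json:
      - op: replace
        path: /spec/storageClassName
        value: gp2                    # assumed EKS storage tier
  - subject:
      resource: services
    name: changeServiceType           # assumed transform name
    json:
      - op: replace
        path: /spec/type
        value: LoadBalancer
```

The design idea is that the backup artifact stays generic, and cluster-specific details (storage tier, service exposure, DNS) are patched in at restore time.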
You can see up in the top it says AWS. And if I go into this, so I've restored this now and it's running in EKS. Now, the most important thing is: do I have the mission critical high scores? Yes, I do. Everything's in there. All good. And that's exactly what we wanted to show. So this just highlights a few of those areas. Yes, backup and recovery is super important; it's kind of table stakes, but you've got to do it in the right way. K10 lives within the Kubernetes cluster, so it leverages that API so that we're more efficient at capturing the whole application. You can see there that we've just shown the completed successful run on that. Let me jump in. So what we just spoke about was very much storage within Kubernetes leveraging persistent volumes, persistent volume claims and an external storage volume. Now, this hasn't always been the case from a Kubernetes point of view. This is what we just did: we had a stateful set, in fact I think it was a deployment, but it was using a persistent volume claim, a persistent volume and an external storage volume. I'll work backwards on this. So we have the container storage interface. Now, the CSI driver enables all storage vendors to write against the framework that Kubernetes, or the community, has developed, so that we're marrying up Kubernetes functionality with storage vendor functionality, which is better than the in-tree provisioner, which was waiting on the whole code release every single time Kubernetes was released. So without going into too much of the detail, all of these slides will be available afterwards.
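The CSI plumbing just described boils down to two cluster objects: a storage class pointing at the vendor's CSI provisioner, and a volume snapshot class that lets tools take CSI snapshots of claims. A minimal sketch, using the hostpath CSI driver from the demo; the object names are illustrative and the provisioner string depends on your storage vendor:

```yaml
# StorageClass: maps PVC requests to a CSI provisioner (vendor-specific).
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-hostpath-sc              # assumed name
provisioner: hostpath.csi.k8s.io     # the hostpath CSI driver's provisioner
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
---
# VolumeSnapshotClass: enables CSI snapshots of PVCs backed by this driver,
# which is what gives a fast, local point of recovery.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-hostpath-snapclass       # assumed name
driver: hostpath.csi.k8s.io
deletionPolicy: Delete
```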
But basically what this means is that whatever storage vendor you're choosing here, we've got the ability to use it within our Kubernetes environment if we have that CSI-compliant driver, as well as things like volume snapshot classes, which is what we're going to use to take a very efficient point of recovery. I wouldn't say that snapshots are the only point of recovery that we should have, and we should have an export out into object storage or another storage layer, but it gives us a way of recovering super quick into the live production system if accidental deletion or something very tiny was to happen within that failure scenario. Now, the next one that we want to talk about is: what if we've got a data service that is actually running external to the Kubernetes cluster? Maybe our Node.js front end is running in Kubernetes, but we've got a database server that lives on a virtual machine, on a cloud virtual machine, or we're leveraging RDS; how do we get access to that? And we can do that as well with Kubernetes, using config maps and secrets. And what that allows us to do is marry up the Kubernetes cluster, or use the Kubernetes API, to access that, thus giving our Node.js application access to that database. So we're actually seeing this quite a lot within environments. Okay, so as you can see here, I have the RDS instance; this is actually a Postgres database within my RDS cluster. You can see where it's located, et cetera. So think of this as where all my mission critical application data is going to be living. And in our Kubernetes cluster we have a namespace called rds-app. And if we then go and take a look inside of that, we have a config map that says how we connect our application to our RDS instance, and we also have a secret in there as well which gives us the DB creds, as you saw on the slide before.
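The config map and secret pattern just described might look like the following. Everything here is a placeholder: the endpoint, database name, and credentials are made up for illustration and would come from your RDS instance.

```yaml
# Illustrative only: endpoint, names, and credentials are placeholders.
# The ConfigMap carries non-sensitive connection details...
apiVersion: v1
kind: ConfigMap
metadata:
  name: rds-config
  namespace: rds-app
data:
  DB_HOST: mydb.example.eu-west-1.rds.amazonaws.com
  DB_NAME: dvdrental
  DB_PORT: "5432"
---
# ...and the Secret carries the credentials, consumed by the front end
# pods as environment variables or mounted files.
apiVersion: v1
kind: Secret
metadata:
  name: rds-creds
  namespace: rds-app
type: Opaque
stringData:
  DB_USER: postgres
  DB_PASSWORD: changeme
```

Because the connection details live as Kubernetes objects, a backup of the namespace captures how the application reaches its external database, not just the in-cluster pieces.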
So I've also got Kasten K10 deployed, and in here I've already got a policy created. Now, I hit this run once, and I'll come back to what that looks like shortly. So if we now go into this, basically what this is doing is we're giving it a name, we're giving it our comments if we want to give it a description, and we're saying what we want to do with that; it's a snapshot action. And we're saying when we want to run it, so I could have it on a backup frequency or just have it on demand. Then I'm saying how to store it: we can just take a snapshot, but then we can also export that out into a separate location, an object storage location. In this instance I'm sending it to AWS S3. And there's a few more options around this. So we get to choose what application we actually want to protect; we can do this by namespace or we could do it via labels. And then which application resources you want to specifically capture; now, I want to just do everything in here. And we also have something where we dive into the Postgres database, or any data service, that enables us to quiesce that workload and its application data so that we've got a consistent copy. So that should be running, and it takes a little bit of time, so I'll probably speed this up. But if we then go back in here and we do a refresh, we should start to see that we've got this backing up status. And then what we'll do is we'll just wait for this to complete, and then we've got the ability to use that data to recover from, like we saw in the previous demo, in exactly the same fashion: this DVD rental database. And I can actually restore this into an EKS stateful set. And again, we've got the same database that we've recovered into. But let's just make sure that all of this works. So the backup is initially taking that first snapshot, which is here, and then what we're doing after is exporting that out into our object storage.
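The policy being clicked together here can also be expressed declaratively. The sketch below follows the general shape of K10's Policy custom resource; the policy name, schedule, and the S3 location profile name are assumptions, and exact fields may vary by K10 version.

```yaml
# Sketch of a K10 backup policy: snapshot the rds-app namespace on a
# schedule, then export the restore point to an S3 location profile.
apiVersion: config.kio.kasten.io/v1alpha1
kind: Policy
metadata:
  name: rds-app-backup               # assumed name
  namespace: kasten-io
spec:
  comment: Snapshot rds-app, then export to object storage
  frequency: "@daily"                # assumed schedule
  actions:
    - action: backup
    - action: export
      exportParameters:
        frequency: "@daily"
        profile:
          name: s3-export-profile    # assumed Location Profile name
          namespace: kasten-io
        exportData:
          enabled: true
  selector:
    matchLabels:
      k10.kasten.io/appNamespace: rds-app
```

The snapshot gives the fast local restore point; the export is what survives the loss of the cluster itself.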
So my Kubernetes cluster is a GKE cluster and we're connecting our application into RDS. So there's freedom of choice when it comes to where you want to run your workloads and where you want to run your data as well. For that use case, data doesn't always have to reside in the same location or on the same platform as your application. It might be that you're using Kubernetes from a compute point of view in a certain geo or a certain cloud provider that you wish, but then you're leveraging something like a PaaS solution such as RDS to give the best option for the data. Okay, so that's everything complete. And if we go back into our RDS, we should see that we're now back to being fully available, although we didn't take anything offline. And again in Postgres we've got the ability to see that database, and the one from a previous restore we've got in an EKS cluster. Now, from a K10 point of view, if we go back to the dashboard and into our applications, we have several restore opportunities, either from a snapshot or from the exported copy, where we can say, okay, something bad has happened, I want to restore that. So we can then start to say, okay, I need to restore this, I want to restore everything that goes with it back into our environment. So I think from that point of view, obviously data can reside anywhere, but you still need to be able to protect it. If you just used a point solution that was protecting Amazon RDS, then you wouldn't have any idea about the whole application. So if you had to capture some of that DVD rental front end application, maybe it's not just built of a front end and a back end database. Maybe there are other microservices that build up that application. Maybe there's some sort of metrics and logging that we also want to capture in a consistent fashion. So just capturing that RDS instance is not going to be enough.
And that's kind of the same ethos as saying we can protect the LUN that's coming out of the storage system, and you could just protect that, take a NAS backup, do something with that. But when it comes to restore, what does a restore look like then? I'd rather have everything as an application, recover as an application, and then get granular about how you recover that. I think another misconception is that stateful workloads within Kubernetes are the only ones that need to be protected. However, I could argue that many of the applications that you might consider to be stateless workloads still have some sort of data that you would like to retain, whether that be simply logs or visualizations of those, but also complex environments; being able to protect those. And if you've got more than 100 or 200 namespaces full of different applications in your Kubernetes cluster, that's another use case: yes, you might have the actual source code and be able to recover very quickly, but what if you don't know which one it is? And we've got customers doing the same thing there. This is me going back to that point about all of those different platforms still being available. And when we get to Kubernetes, there's no shining light, there's no shiny green button to say, oh, we don't have to back up stuff anymore, everything's sorted, Kubernetes has fixed it all. That's not the case. We still need to recover failed applications. There's still accidental deletion. It's still a database at the end of the day; it still requires that application consistency so that we can recover it. And more to the point, that data is still probably the most important asset within your business as well. So we want to make sure that we're covered so that we can recover from any of those failure scenarios.
These are just some of the challenges that we have: protecting persistent storage, complex stateless environments, individual stateful workloads, the logs and other areas. Application consistency is one that I don't have on here, but things like stateless configurations around load balancing, that IP address that I first mentioned in the first demo, could be a huge savior if you're having to recover across different geos or across different clouds; just making sure that we can update the DNS as part of that process. I've mentioned this all the way through: the approach has to be on the application. Now, it says Kasten's approach, but this should just be any data management; if you don't use Kasten, that's absolutely fine. I mean, we have open source tool sets as well that look after the application data, in Kanister, but we need to be looking at the whole application. So that includes the ingress, the service, the pods, the stateful sets, the config maps, the secrets, et cetera, all of the persistent volumes. It needs to contain all of them together so that we can recover them all together. And then the freedom of choice: Kubernetes can pretty much run anywhere, so we need to be able to protect all of those different areas and be able to run on all of those. But also, no database is the same. We might be running Postgres; you saw me running Postgres and MongoDB in this demo. But maybe we're using Elasticsearch for our logging and metrics, maybe we're using different tool sets for other areas of our data services. So being able to protect all of those across different distributions and on different storage gives us the ability to have that freedom of choice. And then finally, how to get hands on with Kasten K10: we have lots of learning resources at learning.kasten.io. I believe that QR code should take you there.
If it doesn't, it will take you to a hands on lab that is very similar, and that means that you don't have to go and spin up your own cluster so that you can get hands on and see what it does. But yeah, with that, I'd just like to say thank you very much for sticking with us. Hopefully those demos are useful. But yeah, please reach out if there's questions at all and enjoy the show. Thank you.

Michael Cade

Senior Global Technologist @ Kasten by Veeam



