Conf42 Cloud Native 2023 - Online

Just-in-time Nodes for Any AWS EKS Cluster - Auto Scaling with Karpenter


Abstract

Karpenter simplifies K8s infrastructure with the right nodes at the right time. It automatically launches just the right compute resources to handle your cluster’s applications. It is designed to let you take full advantage of the cloud with fast and simple provisioning for Kubernetes clusters.

Summary

  • Today we are going to be discussing Karpenter. The name of my presentation is Just-in-time Nodes for Any EKS Cluster: Auto Scaling with Karpenter.
  • What is EKS, Elastic Kubernetes Service, on AWS? Karpenter actually works on top of EKS. After that we're going to talk about Kubernetes autoscaling. And at the end we spend probably 10-15 minutes doing a demo.
  • EKS is short for Elastic Kubernetes Service. It's a managed service on AWS. EKS provides an experience for reliability, security, availability and performance. Lifecycle management and scaling are taken care of for you behind the scenes.
  • Kubernetes autoscaling has two sides: one focuses on the application itself, the other on the nodes and the infrastructure. Customers are moving a variety of different workloads to Kubernetes, and that's where some challenges come into the picture.
  • Consolidation can reschedule running pods onto existing nodes that are underutilized, or launch new, more cost-efficient nodes within the cluster. Consolidation never brings your application down, but it optimizes capacity quite a lot.
  • Karpenter works by looking for pending pods and bin packing them. Instead of talking to an Auto Scaling group, Karpenter talks to the EC2 Fleet API. Early pod binding can shave seconds off per-node startup latency.
  • You can create multiple provisioners with different weights, or match a specific pod to a specific provisioner. You can also restrict instance selection while diversifying across different configurations. There are takeaways you should be familiar with, or at least evaluate, if you're looking to implement Karpenter.
  • Karpenter is a tool that scales EC2 instances up and down in your AWS account. The demo shows in real time the current state of an EKS cluster. Only one Karpenter pod at a time is responsible for making scaling decisions and taking the scaling actions.
  • The other object deployed is called the AWSNodeTemplate. It is responsible for telling Karpenter how to actually deploy new instances. The demo then scales the inflate application up.
  • So I'm going to enable consolidation. What I want to do now is deploy three replicas of my inflate application. Before the node is even ready on Kubernetes, Karpenter is already doing all the work behind the scenes. It takes a minute or so to create a new instance.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello everyone, thanks for joining my session. My name is Samuel Baruffi. I am a solutions architect with AWS, and today we are going to be discussing Karpenter. The name of my presentation is Just-in-time Nodes for Any EKS Cluster: Auto Scaling with Karpenter. So let's just get started with a quick agenda. We're going to, at a very high level, discuss EKS, Elastic Kubernetes Service on AWS, just to set the tone and establish an understanding of what EKS is, because Karpenter actually works on top of EKS. If you're not familiar with EKS, you might need a little bit of understanding of Kubernetes, but hopefully the quick overview will provide that guidance. After that we're going to talk about Kubernetes autoscaling: what mechanisms are available natively in Kubernetes, and also what the current implementations for the cloud look like. Then we're going to talk about some customer challenges with those current implementations, and after that about Karpenter and how it solves some of the challenges we've heard from customers trying to do autoscaling on Kubernetes. And at the end we'll spend probably 10-15 minutes doing a demo, installing Karpenter and showcasing how it can give you a lot of flexibility and speed to scale the nodes within your clusters up and down. So moving forward, let's do an overview of EKS. EKS is short for Elastic Kubernetes Service. It's a managed service on AWS. EKS runs vanilla upstream Kubernetes, and it's certified Kubernetes conformant for the Kubernetes versions it supports at any given time. EKS currently supports four versions of Kubernetes, which gives you as a customer time to test and roll out upgrades. Having lifecycle management of upgrades on your Kubernetes clusters is really important, and AWS helps you with that because it's a managed service. EKS provides an experience for reliability, security, availability and performance. On the next slide you'll see how you have a data plane and a control plane that can be managed for you on both sides. The whole idea is that by using EKS you don't need to do a lot of the operations, what we call undifferentiated heavy lifting, to manage your Kubernetes clusters; you can rely on a managed service like EKS to take care of tasks like upgrades, lifecycle management, security and so forth. Of course, it's always a shared responsibility: some things are taken care of by AWS, and some things are your responsibility to configure properly, which gives you the proper flexibility. So when we look at a high-level overview of what EKS is, you have two boxes here. The first box we're going to talk about is the control plane. When you look at the box on the right, which says AWS cloud, it means it's running behind the scenes on AWS. At the top you can see the control plane, which is a fully managed, single-tenant Kubernetes control plane per cluster. Once you create your EKS cluster, behind the scenes AWS creates a single-tenant control plane just for you, and you only get the specific endpoint. You can create private or public endpoints, which we call cluster endpoints. And behind the scenes, if you're familiar with Kubernetes architecture, you have the etcd database, the API server, the scheduler and the controllers.
AWS is going to manage the control plane for you, and not only manage it but scale it as needed. So you don't need to worry about that; it's all taken care of on the control plane side by AWS. Then when you look at the left box, you see the customer VPC. That's the virtual private cloud in your AWS account, and that's where you deploy your data plane. The data plane means the nodes where your containers, your pods, are going to be running. There are two types of node groups you can create: self-managed node groups and managed node groups. With a self-managed node group you're responsible for all the configuration: your Auto Scaling group, managing the AMI and everything else. With managed node groups you get a managed experience for your data plane as well, so lifecycle management and scaling are taken care of for you behind the scenes. You can also use Fargate. Fargate is a serverless container offering that does not require any node group at all, in the sense that you don't need any EC2 instances, neither a self-managed node group nor a managed node group. With Fargate you pay per pod, and that running pod runs behind the scenes in an AWS-managed account. You can see here how it works: it creates an ENI within your VPC that links back to the Fargate micro-VM running on the AWS cloud. Fargate also works with ECS, but it has integration with EKS, like you can see here. For this talk we're not going to focus too much on the EKS data plane or control plane; we're going to talk about EKS autoscaling and Kubernetes autoscaling. So with that said, let's move to the next section. What are the available resources and configurations that you, as a customer or user of Kubernetes, can fine-tune for autoscaling? You can separate autoscaling in Kubernetes into two categories: one is the application itself, and the other is the nodes and the infrastructure. The first two items are focused on the applications that are running. The first one is called the Horizontal Pod Autoscaler, HPA for short. The whole idea of HPA is that you create a deployment on your cluster and decide how many replicas of that deployment you want. Let's say I have an Nginx server and I want three replicas of that Nginx pod deployed across my environment. You can configure HPA on top of that deployment and specify metrics, for example CPU, memory, or even your own custom metrics. HPA watches that metric, and if a threshold you've configured is crossed, say, if at any given time the aggregate CPU of your deployment goes above 80%, it adds another replica to your deployment. HPA takes that job for you and horizontally adds more pods, up to the maximum you've configured, for whatever metric you've configured. That's what's called HPA.
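To make that concrete, here is a minimal sketch of an HPA manifest for the Nginx example just described. This is illustrative rather than from the talk; it assumes the metrics server is installed and that a Deployment named nginx exists:

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: nginx-hpa            # hypothetical name
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: nginx              # the deployment being scaled
      minReplicas: 3
      maxReplicas: 10
      metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 80   # add replicas above 80% average CPU

HPA then adjusts the replica count between the configured minimum and maximum as the metric crosses the target.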
But you also have another option, called the Vertical Pod Autoscaler, VPA for short. VPA is less common, in a sense, because Kubernetes is really good at scaling distributed systems horizontally. With VPA you have the ability to take a pod that is running with, for example, 2 GB of memory and, once a specific threshold has been reached, recreate it as a new pod with 4 GB of memory. It's the same idea as HPA, but instead of adding new replicas to your deployment, it recreates the pod with more memory available. But both of those, HPA and VPA, only look at your application; they don't look at the cluster itself or add more infrastructure nodes. If you want to run a flexible and elastic Kubernetes cluster, you also need something responsible for creating more nodes for you. That's where the Cluster Autoscaler comes in. With the Cluster Autoscaler, let's say you have two nodes in your data plane and you try to schedule, in this example, four more pods, but there are no resources available within the existing nodes in your node group. The Cluster Autoscaler, once you install it and integrate it with your provider, in this case AWS, looks for pending pods and says: I don't have resources currently available to place these four pending pods. So the Cluster Autoscaler talks to the Auto Scaling group behind your node group, either a self-managed or a managed node group, and asks it, through its API, to spin up a new node, a new EC2 instance, within that Auto Scaling group, so the four pending pods can be scheduled and run. Behind the scenes, each Auto Scaling group increments its size based on the needs of the pending pods. This works fine for most applications and workloads. However, as Kubernetes and EKS have gained broader adoption, customers are moving a wider variety of workloads onto them, and as you can see in this example, the Cluster Autoscaler just creates a new instance of the same instance type within the same Auto Scaling group. That's where some challenges come into the picture. We've heard feedback from customers on why the Cluster Autoscaler doesn't work every single time, or where improvements should be made. Nearly half of AWS Kubernetes customers have told AWS that configuring the Cluster Autoscaler is challenging, and we're going to go through some of those challenges to set the scene for why it was important for us to create Karpenter. First of all: node group and Auto Scaling group sprawl. Different workloads need different compute resources. An AI/ML workload has very different requirements than, for example, your web application or your batch applications, right? Unfortunately, the only thing the Cluster Autoscaler is able to do is add new instances of the same type to your existing managed node group. You can create multiple managed node groups with different instance types, but that adds a lot of complexity in managing those, right?
So what customers have told us is that not all workloads need to be isolated on specific node groups, and balancing the needs of specific workloads adds a lot of complexity, because now you need to manage multiple Auto Scaling groups and multiple managed node groups. It becomes cumbersome, and it's really hard to achieve the right performance, cost and availability. As an example, if you need Spot, you can't mix and match Spot and On-Demand within a single managed node group. You need multiple managed node groups, each with its own Auto Scaling group behind the scenes, and it becomes a real challenge to provide availability around Spot interruptions, or to follow best practices like spreading workloads across AZs, while always thinking about cost and trying to improve cost and performance for those workloads. Another challenge we've heard from customers is that the Cluster Autoscaler can be slow to respond to capacity needs and spiky workloads. If you think about ETL jobs, GPU training jobs or ML workloads, the speed at which big data and AI/ML workloads can be spun up is critical. Delays in providing capacity for these workloads can slow down innovation and decrease the satisfaction of your data scientists and engineers. These jobs typically spin up several nodes of expensive accelerated EC2 instances, for example very expensive GPUs. So you want those spun up very quickly, but also spun down very quickly, because a slow scale-down means wasted resources, which you don't want to be in the business of. Another challenge: it is very hard to balance utilization, availability and cost. With the Cluster Autoscaler it's hard to get high cluster utilization and efficient operation without over-provisioning resources to ensure a consistent user experience. This can result in low utilization and lead to wasted resources, and the impact can be significant. As an example, let's say you want to make sure your application runs across multiple availability zones but has different resource requirements; then you potentially need multiple Auto Scaling groups, and managing those groups across AZs while keeping them fully utilized becomes very challenging, sometimes practically impossible without wasted resources. So with those three challenges we've discussed, we came up with Karpenter. But what actually is Karpenter? Karpenter is an open source, flexible, high-performance Kubernetes cluster autoscaler. Instead of deploying the Cluster Autoscaler, you can deploy Karpenter on your EKS cluster. It is open source and Kubernetes native. It doesn't have any concept of groups, it's what we call a groupless approach, and we're going to talk in a moment about why it's called groupless, but it's essentially automatic node sizing. Instead of being limited to launching instances of the same type you have in your Auto Scaling group, Karpenter can look at the specific requirements of the pending pods and choose the best performance and cost for that specific need at any given time. And it's also much more performant at scale, because it behaves differently from the Cluster Autoscaler.
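To give a feel for the groupless model, here is a minimal sketch of a Karpenter Provisioner with no requirements at all, using the v1alpha5 API that was current at the time of this talk; with nothing restricted, Karpenter is free to pick instance types on its own. The providerRef name is an assumption:

    apiVersion: karpenter.sh/v1alpha5
    kind: Provisioner
    metadata:
      name: default
    spec:
      # No requirements: Karpenter may choose any instance type, size,
      # architecture and capacity type it judges best for the pending pods.
      providerRef:
        name: default            # points at an AWSNodeTemplate, shown later
      ttlSecondsAfterEmpty: 30   # remove a node 30 seconds after it empties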
The APIs it uses and the way it looks for pending pods on your cluster are a little bit different. The goal is to launch 1,000 pods within 30 seconds. That's the goal Karpenter has set in mind, and depending on your environment it can actually achieve that. So let's look at how Karpenter works. Very similar to the Cluster Autoscaler, when you have pending pods, Karpenter is always watching the scheduler, because it's integrated into the Kubernetes-native ecosystem. It looks for pending pods, and those pending pods, looking at the existing capacity in this case, can't be placed because the nodes are full. So the pending pods become unschedulable, and that's where Karpenter comes in. It replaces your Cluster Autoscaler; you don't run the Cluster Autoscaler in this case. You have Karpenter deployed, and instead of talking to an Auto Scaling group, because there are no groups, Karpenter talks to the EC2 Fleet API, which provides a bunch of benefits. Behind the scenes, Karpenter looks at the specific requirements of those unschedulable pods and finds just-in-time capacity that is a perfect fit for what you need. It's very flexible in the configuration it allows, but if you don't provide any specific limitations on the instances it can choose, it will find an instance type that fits your requirements while also optimizing for cost and performance, and it makes that decision for you. It's also deeply integrated with Kubernetes: it uses the watch APIs, it has a lot of labels and finalizers, and, like I said, it does a lot of automated instance selection, matching a specific workload to a specific instance type. Karpenter also works really well with Spot. We're going to talk in a moment about what a provisioner is, but if your provisioner has support for Spot, and you can mix and match Spot and On-Demand, Karpenter can look for the cheapest, most performant Spot capacity available in the availability zone it wants to deploy into, pick that, and deploy it for you, and it can also handle interruptions for you. It integrates with the two-minute interruption notice that Spot provides, to allow your applications to be rescheduled before the node interruption happens. Another really good thing Karpenter has added is the ability to consolidate. Consolidation is a feature that looks for opportunities to improve your cluster utilization over time. Karpenter doesn't just scale up and down; it also looks at your cluster at a high level, at which nodes you currently have in place, and whether there's an opportunity to remove some of those nodes or bring up other nodes that are more performance- and price-optimized for you. So it can reschedule running pods onto existing underutilized capacity in the cluster, but it can also launch new, more cost-efficient nodes within the cluster and replace nodes that were more expensive. Let's say you have three nodes that are m5.large, just to give an example.
Karpenter might look at that and say: I can spin up an m5.2xlarge, and maybe an m5.medium, and bring all that capacity onto those two instances, which will be more performant and better priced for you. Consolidation is a feature you can enable, and if you don't want it, that's okay. Because it reschedules pods, you want to make sure your application is well distributed within your cluster; consolidation never brings your application down, but it optimizes capacity quite a lot. So let's look at how Karpenter provisions a node on AWS, compared to the Cluster Autoscaler. With the Cluster Autoscaler, let's say an application's scheduler or HPA triggers a pod to be created and it ends up pending. The Cluster Autoscaler looks at the pending pods, talks to the Auto Scaling group, and the Auto Scaling group talks to the EC2 API to change, in this case increase, because you have pending pods, the number of nodes in your node group. The way Karpenter works is that, instead of going through the Cluster Autoscaler and a specific Auto Scaling group, Karpenter watches the pending pods directly. Those pending pods trigger an action in Karpenter, and instead of talking to the Auto Scaling API, Karpenter talks to the EC2 Fleet API, which is much more performant when you're trying to determine what capacity can be deployed in a specific availability zone in a region. EC2 Fleet is responsible on the AWS side for those decisions, and Karpenter consolidates instance orchestration responsibility within a single system. We've talked about groupless provisioning: Karpenter takes an application-first approach, because it's always looking at the pending pods and grouping them together, which is called bin packing. Every time there are pending pods, it bin-packs them and works out the simplest node provisioning that makes sense for your cluster. The way you configure compute, as you'll see in the demo, is really simple; you just have two objects: one is the provisioner, and the other is the node template for your specific provider, in this case AWS. It also reduces the cloud provider API load a lot, because it goes directly to the EC2 Fleet API; it doesn't have the rate limitations on calling Auto Scaling groups that the Cluster Autoscaler had, and it reduces latency significantly. Karpenter chooses the instance type from the pod resource requests, so when you're doing deployments, you always want to make sure you set memory and CPU requests; that's what Karpenter uses to work out how much memory and CPU is required. It then chooses the node per the pod scheduling constraints. If you have constraints like specific availability zones you want a pod deployed in, or specific instance types or GPUs, it looks at the labels on the pod spec, reads them, and makes a decision based on those constraints. And that capacity is acquired directly through EC2 Fleet.
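As an illustration, a pod can steer those decisions through ordinary resource requests plus Karpenter's well-known labels in a nodeSelector. A minimal sketch with hypothetical values:

    apiVersion: v1
    kind: Pod
    metadata:
      name: constrained-app      # hypothetical name
    spec:
      nodeSelector:
        topology.kubernetes.io/zone: us-west-2a   # pin to a zone
        karpenter.sh/capacity-type: spot          # ask for Spot capacity
        kubernetes.io/arch: arm64                 # ask for a Graviton (Arm) node
      containers:
        - name: app
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: "1"           # Karpenter sizes the node from these requests
              memory: 1.5Gi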
Karpenter also tracks the nodes using native Kubernetes labels, and, this is a specific one, it binds the pod early to the node. It doesn't need to wait for a Cluster Autoscaler-style decision like before: while the node is being created behind the scenes, the scheduler has already bound the pod, so the node can start preparing immediately, including pre-pulling the image, instead of waiting the way the Cluster Autoscaler has to. This can shave seconds off per-node startup latency. It's a very nice feature that helps Karpenter be more reliable and fast when doing those scaling activities. So let's quickly look at how Karpenter scales up. Say we have pending pods here at the top. Karpenter looks at those pending pods and creates a new node. And because you have requests set on your application, it knows, both at a node level and at a cluster level, what the utilization is and what target it wants to hit. You can set up provisioners; by default all instance types that Karpenter can pick from are included, but you can specify, and I'm going to show some examples, the specific instance types you want to make available. Then for scaling down we have different options. We talked about consolidation, and we're going to show an example shortly, but you have two choices: you can either use ttlSecondsAfterEmpty or you can use consolidation. They are mutually exclusive. Consolidation is the newer feature; before it existed, you had the ttlSecondsAfterEmpty setting. In the example I'm showing you, it's set to 10 seconds. What this feature does is look for nodes that are empty. In this case I just removed some pods from my nodes, and 10 seconds later, if a node is still empty, Karpenter removes the node completely. One thing I want to mention: it doesn't count DaemonSets, because DaemonSets run on every single node. It's smart enough to realize that if there are three DaemonSet pods running on that node, they don't matter, because DaemonSets run across all nodes, so it removes the node anyway. I talked about bin packing: the cool thing is it combines the requirements of all the pending pods, and there are well-known labels you can define on your deployment that get configured on the node as well. Let's say you want to run a specific application on an Arm Graviton2 processor; you can define those labels, and when Karpenter does the bin packing it makes a consolidated decision on how to organize all the pending pods in the queue. And then there's consolidation, which I recommend. There are potentially reasons to use ttlSecondsAfterEmpty, but consolidation is a much broader, more feature-rich solution. If you enable it on your provisioner, you see consolidation enabled: true. Let's say in this example you had five pods on this node here on the right.
Once that goes back down to two, you can see that you have a lot of underutilized resource. Karpenter will look at that, if you have consolidation enabled, and say: you know what, I can actually run those two pods on a much cheaper node. So it spins up the new node for you, then it moves those pods onto the new node, and finally it removes the old node. So consolidation allows Karpenter to delete a node when its pods can run on free capacity that already exists in the cluster, but it can also delete a node when you no longer need that big node, and create smaller ones instead, like the one you saw here, which is simply a node replacement. Continuing the example: the third node from the top had four pods, and now it only has one pod. What Karpenter will do is drain it, move the pod to the node at the bottom, and then remove the node. So it keeps cost optimization in mind, which is really important for scaling your Kubernetes solutions. Now we're going to spend a few examples on how you can configure your provisioner. The provisioner is the Kubernetes object that, once you deploy Karpenter, and you'll see this in the demo, lets you configure how provisioning behaves. You can create multiple provisioners with different weights, or match a specific pod to a specific provisioner; there are labels you can mix and match. One example of the flexibility of provisioners is the ability to select purchase options. You can select the capacity type: in this case, in the requirements, under capacity type, you're choosing Spot and On-Demand. When you have Spot and On-Demand configured at the same time on a provisioner, Karpenter will always favor Spot, and it only picks On-Demand if there are Spot constraints. So if good Spot options for launching an EC2 instance aren't available at that time, it defaults back to On-Demand. You can also select different architecture types. You can have provisioners that can deploy both arm64 Graviton2 processors and amd64 x86 processors. That means the provisioner will look at the specific architecture type your pending pod needs, and if it needs Arm and there is no capacity available in your cluster, it will go and deploy a new Arm Graviton2 EC2 node, but it can also do that for amd64. Another capability is restricting instance selection while diversifying across different configurations. You can define the size, the family, the generation and the CPUs. In this example you don't want Karpenter to spin up instances that are nano, micro, small or large; you only want medium, xlarge, 2xlarge and 4xlarge. So you can put that requirement on your provisioner, and Karpenter will always honor it. You can have multiple provisioners, and whenever a pending pod matches a specific provisioner, Karpenter uses the configuration you have in place there. You can also restrict availability zones: you can say this provisioner should only deploy new nodes into the us-west-2a and us-west-2b availability zones, which is useful if you have a requirement that your applications, or a set of them, only run in certain zones.
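A sketch of a Provisioner combining those requirement types; the v1alpha5 API and the exact values are assumptions for illustration:

    apiVersion: karpenter.sh/v1alpha5
    kind: Provisioner
    metadata:
      name: diversified          # hypothetical name
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]   # Spot favored, On-Demand as fallback
        - key: kubernetes.io/arch
          operator: In
          values: ["arm64", "amd64"]
        - key: karpenter.k8s.aws/instance-size
          operator: NotIn
          values: ["nano", "micro", "small", "large"]
        - key: topology.kubernetes.io/zone
          operator: In
          values: ["us-west-2a", "us-west-2b"]
      providerRef:
        name: default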
Another thing you can do, and this is a new, separate provisioner, is create different provisioners. In this case it's not the default provisioner; it's called west-zones. You can say west-zones can only deploy within these three availability zones in the us-west-2 region, and it can do either Spot or On-Demand. This is a very simple provisioner: it just picks whatever instance type is most performant and available at the time, and it's likely going to be a Spot instance if one is available for you. You can also isolate expensive hardware. If you have needs, for example, applications that need a GPU, you can specify which instances you want that provisioner to deploy. In this GPU case, you only want p3.8xlarge or p3.16xlarge. But then what you do is create a taint on those nodes, and if you're familiar with taints and tolerations, that means only pods with a toleration for that taint can be scheduled onto those nodes and consume those resources. If you don't specify a matching toleration on your pods or deployments, this provisioner won't be selected. That gives you the ability to have different provisioners to fit your specific use cases, and it's all declarative, using Kubernetes custom resource definitions, CRDs. So hopefully that gives you a good amount of information on Karpenter before we do the demo, but there are some takeaways that, if you're looking to implement Karpenter, you should be familiar with or at least evaluate. The first one: if your application can tolerate disruption, in the sense that you've distributed it across multiple nodes and availability zones, please use EC2 Spot instances to optimize for cost, because Karpenter watches for Spot node interruptions and automatically reschedules your pods onto a new instance it provisions. Of course, you don't want to be in that business if you only have one pod; if for whatever reason you have a stateful application that can only run as a single pod, you probably want to avoid Spot. It needs to be case by case, but most of the time, if you're running stateless applications on Kubernetes, you should be using Spot. Then use provisioners to ensure your node scaling and Spot usage follow best practices by default. Like I said, you can have multiple provisioners, but you should have a default provisioner with a very diverse set of instance types and availability zones. If you don't have specific needs like GPUs, you can just let Karpenter choose what's best for you, given the wide variety configured on the default provisioner. Then you configure additional provisioners for different compute constraints, like GPUs, or jobs that must run on specific instance types for performance or architecture reasons, and you link your deployments to those additional provisioners. And of course you control scheduling using Kubernetes-native mechanisms, node selectors, topology spread constraints, taints, tolerations and provisioners, integrated into the scheduling of your application.
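A sketch of that GPU isolation pattern; the instance types are the ones named in the talk, while the taint key is an assumption (Karpenter's docs commonly use nvidia.com/gpu):

    apiVersion: karpenter.sh/v1alpha5
    kind: Provisioner
    metadata:
      name: gpu
    spec:
      requirements:
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["p3.8xlarge", "p3.16xlarge"]
      taints:
        - key: nvidia.com/gpu    # only pods tolerating this taint land here
          value: "true"
          effect: NoSchedule
      providerRef:
        name: default

A pod that should land on these nodes then carries a matching toleration, for example one with key nvidia.com/gpu, operator Exists, and effect NoSchedule.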
And of course you should use the Horizontal Pod Autoscaler in conjunction with Karpenter, so HPA focuses on the application scaling and Karpenter focuses on the cluster scaling, spinning up instances as needed for specific requirements. Before we go into the demo, please look at these resources; I'll go through them quickly. First, you have the Karpenter web page, karpenter.sh, where all the documentation is available, with a lot of examples and a lot of detail on how Karpenter works. Because Karpenter is open source, you can look at the Karpenter GitHub repository; if you have an issue, feel free to create an issue on GitHub, or if you need help, the community is always there. There are workshops if you want to play around with Karpenter on your own: the Karpenter workshop on ec2spotworkshops.com goes in depth on Karpenter and is really good, and if you want something more high level, you can do the EKS workshop, go to the Karpenter section and play around there. And there is a really good 50-minute video if you want to hear other SMEs at AWS talking about Karpenter; you can just click on that button. Before I start the demo, the only thing I want to mention is that Karpenter currently only supports AWS as a provider, but because Karpenter is open source, we expect that in the future other providers may adopt Karpenter and make this flexible way of autoscaling Kubernetes available to their users as well. So we'll see you in a moment in the demo. Okay, let's jump into the demo. I have created an EKS cluster beforehand. If we take a look at the nodes I have on my cluster, there are two nodes already created. Those are nodes managed by a managed node group on EKS; they are not managed by Karpenter. I just want to start from a clean slate, and you need a place where Karpenter itself will run. You can have Karpenter deployed on a managed node group, and that managed node group doesn't need anything special. If I go to the console, I can show you I have one managed node group, with two desired instances, which are the ones I showed, up and running. Looking at what's running on those nodes, nothing fancy: the aws-node pods, CoreDNS, kube-proxy, and, because I want to use HPA, the metrics server is deployed here as well. I also have kube-ops-view, an application that gives me stats and a nice visualization of my nodes. And there's another nice open source tool called eks-node-viewer; you can just Google eks-node-viewer. It shows in real time the current state of my EKS cluster. The line at the top is the cluster aggregation; you can see the price per hour and the price per month. Below that it's per node: the instance type, how many pods are running on each of them, the price, whether they're On-Demand, and whether they're ready. As I go through and install Karpenter, and once Karpenter starts deploying things for me, you'll see this keep changing; that's why I'm sharing it. So I have everything already set up; I just want to install Karpenter, and Karpenter is available as a Helm chart.
I have this command here that I'm just going to run. What is it actually doing? It's creating the Karpenter installation for me. I already have some environment variables and some pre-configuration in place. If you want to deploy Karpenter with the EC2 Spot integration, there is some pre-configuration you have to do, like creating rules for the SQS queue and for EventBridge and so forth; those were already created for me. So I have deployed Karpenter, and now let's look at the pods Karpenter has. Karpenter is deployed within its own namespace, called karpenter. If I look at the pods Karpenter has deployed, I can see that I have two. Karpenter doesn't run as a DaemonSet; Karpenter is just a deployment. If you run kubectl get deployment in the karpenter namespace, you'll see it's just a deployment, and Karpenter works in an active-standby approach. There are two pods, running across different nodes, but only one pod at any given time is responsible for making decisions and taking the scaling actions for me. The other one takes over as the leader if something happens to the first one. So it's important to understand it's a high-availability setup: not a DaemonSet, just a deployment with two replicas. What we're going to do now: we have installed Karpenter, but I haven't created my provisioner, and the provisioner is what's actually responsible for making those decisions. You saw on the slide before how it's where you declare which decisions you want made when there's a scaling activity. And then there's also the AWSNodeTemplate object, which is responsible for telling Karpenter how to talk to the cloud to scale EC2 instances up and down in your AWS account. I have pre-configured a very simple example here; I'm just going to paste it and show it to you. So what I've done so far: I installed Karpenter, and I deployed a provisioner and an AWSNodeTemplate. Let's quickly look at the provisioner and see what it tells us. If I run kubectl get provisioner default -o yaml, default being the name of my provisioner, it tells me that this provisioner applies a label of intent: apps, and we'll see in a moment why that's important. You can also set limits on your provisioner: the provisioner keeps an account of how much CPU and memory it controls, and you can define an aggregate CPU and memory limit across all the instances the provisioner creates. This provisioner will never go above 1,000 CPUs and 8 terabytes of memory. Then I give my provisioner a name; this is the default provisioner. And here I provide some requirements. I'm saying that for my capacity type I just want Spot, so this will only launch Spot. Then I'm saying my instance sizes can't be nano, micro, small, medium or large; I only want instances xlarge and above. For the operating system I only want Karpenter to deploy Linux, for the architecture I only want amd64 instances, and the instance categories are only C, M and R. I know there's a lot here; you don't need to do all that.
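Reconstructed from the narration, that demo provisioner would look roughly like the sketch below. The values are as stated in the talk except ttlSecondsUntilExpired, whose number isn't given, so the figure here is a placeholder assumption:

    apiVersion: karpenter.sh/v1alpha5
    kind: Provisioner
    metadata:
      name: default
    spec:
      labels:
        intent: apps             # applied to every node this provisioner creates
      limits:
        resources:
          cpu: "1000"            # aggregate cap across all nodes it manages
          memory: 8000Gi         # the "8 terabytes" mentioned in the talk
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
        - key: karpenter.k8s.aws/instance-size
          operator: NotIn
          values: ["nano", "micro", "small", "medium", "large"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["2"]
      ttlSecondsAfterEmpty: 30
      ttlSecondsUntilExpired: 2592000   # placeholder: 30 days, not from the talk
      providerRef:
        name: default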
If you just leave the requirements empty, Karpenter will figure it out by itself; I'm just showcasing how flexible and customizable a provisioner can be. And I'm saying the instance generation has to be greater than two, so it won't deploy, say, an m2; I don't even know if those are still available, but it wouldn't deploy an m2 instance even if they were. I'm not using consolidation in this specific example; I'm just using ttlSecondsAfterEmpty, so 30 seconds after a node is empty it gets removed. And then there's an interesting one, ttlSecondsUntilExpired. You can set expiration dates on your nodes, which helps with lifecycle management, and if you want to roll out new AMI updates this is a good option. You specify a number here, and after a node has expired, Karpenter automatically creates a new instance, with the new AMI if you have one in place, or the old AMI if you don't, and starts moving the pods from the old instance to the new instance. So you always keep those instances fresh. The other object I deployed is called the AWSNodeTemplate, and if we show its YAML, you can see what it does; this is the important piece here. It's responsible for telling Karpenter how to actually deploy new instances: which security group to use when deploying those instances, which subnets to use, and potentially which tags you want on those instances as well. You provide all of this here, and there are many more configurations: you can specify a particular AMI and much more, and you can check that out in the Karpenter documentation. But let's go and try a deployment where Karpenter has to spin up new instances for me. I'm going to create a deployment using an app called inflate. So here I'm deploying this Deployment object with zero replicas for now, and I'm telling it in this deployment that I want to select nodes with intent: apps. If you remember, the nodes Karpenter creates will have intent: apps; this is just to say, please don't deploy this application on the existing two nodes, deploy it on nodes that have the intent: apps label, and Karpenter will spin up nodes with that label. It's just running a pause container, and I'm requesting one CPU and 1.5 GB of memory per pod. But of course, if we run kubectl get deployment, I have zero replicas, so it hasn't created anything yet. You see here that I have inflate and it hasn't done anything. So what I want to do is scale this up. Let's create one replica to start with. Now, in a moment, keep an eye down below: there you go. Karpenter saw that there were pending pods, because there wasn't any node able to satisfy the requirements those pods had. So now it's creating an r4.xlarge, a Spot instance, because remember, my provisioner said Spot only; it wasn't supporting On-Demand. It tells me the price, and you can see now that it's actually spinning up the node. It takes maybe a minute or so to spin up the node.
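For reference, the inflate deployment as described in the narration would look roughly like this; the label, pause image and resource numbers are as stated, the rest is reconstruction:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: inflate
    spec:
      replicas: 0                # starts at zero, scaled up during the demo
      selector:
        matchLabels:
          app: inflate
      template:
        metadata:
          labels:
            app: inflate
        spec:
          nodeSelector:
            intent: apps         # only land on Karpenter-provisioned nodes
          containers:
            - name: inflate
              image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
              resources:
                requests:
                  cpu: "1"
                  memory: 1.5Gi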
Once the node is up and running, let's look at the logs for Karpenter and see what they tell us. If we run kubectl logs... let's wait a moment... ah, remember I said Karpenter has two pods; I selected the pod that is not the leader, the standby one. So let's select the other one: kubectl logs -n karpenter against the leader pod. And you can see: found provisionable pods, computed new node to fit pods. So it found a pending pod, launched a new node, discovered the security group for my node, discovered the specific Kubernetes version, discovered the AMI, created a launch template and launched the instance, which is the instance you see here. If you now run kubectl get nodes, you see I have a new node, the one down below here ending in 145, with one pod. And with kubectl get pods -o wide we see that the inflate pod is running on that 145 instance. But what happens if I want to scale this deployment further, say to ten replicas? So I update the deployment: okay, I need ten replicas. You see that it has scheduled five pods here, and now it says, well, ten replicas won't fit on this r4.xlarge; they just won't, right? So in this case Karpenter says, okay, I need a new instance, and it looks for whatever capacity fits the requirements; always remember, it optimizes for performance and cost. It has spun up a C-family xlarge, and with kubectl get pods we can see the status of those pods. So I have three pods that were already running that fit here; the other two pods on this node are DaemonSet pods, aws-node and kube-proxy, which Kubernetes deploys automatically and which need to run on every single instance. Now it's ready, and if we run get nodes again, they're running, and I have nine pods running: again, the seven pods for inflate plus the two DaemonSet pods required for my pods to run. Now, finally, let's scale this to zero, because I want to see how Karpenter behaves on removal. Remember, I don't have consolidation enabled here, but 30 seconds after my nodes are empty they get removed, and you see there are two pods left, but those two are DaemonSet pods. If we run kubectl get pods -A -o wide and look for all the pods that were on 145, they're gone now; Karpenter already removed the node. Before, on 145, you could see kube-proxy and the aws-node pod, which are just DaemonSet pods. But you saw how Karpenter works.
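The replacement provisioner used in the next step would look roughly like this, reconstructed from the narration: consolidation on, On-Demand only, and ttlSecondsAfterEmpty removed, since the two settings are mutually exclusive:

    apiVersion: karpenter.sh/v1alpha5
    kind: Provisioner
    metadata:
      name: default
    spec:
      consolidation:
        enabled: true            # replaces ttlSecondsAfterEmpty
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]  # On-Demand only this time, no Spot
      providerRef:
        name: default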
Because I'm running out of time here, one thing I want to do is enable consolidation. So I'm going to replace my default provisioner. What you see on my default provisioner now is consolidation set to true; you can see consolidation enabled: true there. And, one second, my computer is a little slow... I'm saying that I only want On-Demand now, not Spot, and I'm excluding those specific instance sizes, and that's pretty much it. And remember, once you have consolidation you need to remove ttlSecondsAfterEmpty, so that's gone. What I want to do now is deploy three replicas of my inflate application. Remember, now I have consolidation enabled; let's see the difference. Okay, I've said: spin up three replicas. Karpenter said: okay, to spin up three replicas, now On-Demand, because my provisioner says On-Demand and no longer Spot, it found that this specific instance is the most performant and cost-optimized for what I'm trying to do, and it's spinning it up now. So we'll just wait for it. Okay. Remember the idea of bin packing and fast scaling: the Cluster Autoscaler would be much slower than this. You can see that it's pre-pulling the image and making the cache available just as the node comes up, so it's already doing some work before the node is even ready on Kubernetes, and you can see those details here. And off it goes; the pods are available. If we quickly run kubectl get pods, I can see that I have three pods, and, let's just wait a second, you can see the three pods are scheduled on 132, which is the new node. So now let's do the same thing again: okay, I want ten replicas. Let's see what Karpenter does. Okay: Karpenter said, I cannot fit the remaining seven pods you want me to deploy on the existing infrastructure, so now I am deploying another C-family 2xlarge, also On-Demand. Let's wait and see what happens. It takes a minute or so to create a new instance and then deploy the containers onto it. But the point I'm trying to make is that, before the node is even ready in Kubernetes, Karpenter is already doing all the work behind the scenes, and you can see all the logs here. Looking at the logs, you can see that it deployed a new instance and the pods are all available. If we run kubectl get pods -A and look at all the pods, you see all the inflate pods: I have pods on 106, but I also have pods on 132, the two nodes that were deployed. So what should happen if I now scale my application down to six? Remember what consolidation does: it doesn't only remove empty nodes, it also tries to make good decisions where it can. So I'm going to scale down to six replicas now instead of ten. You see that, okay, utilization has now dropped to around 40%. Let's see if Karpenter will make a decision here; it might or it might not, so let's just wait a few moments. And it made a decision.
It said: okay, the remaining pods you have, the six replicas, can all fit within the 2xlarge. So it removed my xlarge instance: it cordoned it, moved all the pods onto the existing node, because it saw, well, you don't need two instances, you can have just one; keep the bigger instance, move the pods to the bigger instance, and remove the smaller instance. So we saw that. But what happens if we go a step further? We have this 2xlarge; what happens if we only need three replicas now? So, three replicas, and you can see that only about 40% of the 2xlarge is being utilized. Karpenter should go and look for a cheaper instance that can fit those pods. So let's just wait... ah, there you go. You can see the node is now cordoned, and it's waiting for a new instance, an xlarge, to become available, because that's much cheaper than the 2xlarge, right? So it's waiting for the new instance to come up; once that instance is up and running, it moves the pods from the bigger instance to the smaller instance, and once those pods are moved, ready and running, it removes the bigger instance. Long story short, Karpenter will always be looking for the best-performing, most cost-optimized option for you, and you can create many different provisioners to fit the specific needs of your applications. Hopefully I was able to demonstrate that. Once this finishes... you see, this is ready, and it has now removed the 2xlarge. And if I look at the pods running here, once my page refreshes, you see that all the inflate pods are now running on my 123 instance; you can see they are up and running on 123. So I just want to say thank you so much. Hopefully the demo was useful. Please reach out to me on Twitter and LinkedIn if you have any questions. Go ahead and test Karpenter; I would highly recommend running Karpenter on EKS. It can make your life much easier, more flexible and more cost-optimized. I hope you had fun; thanks for tuning in, and have a great rest of your conference. Bye bye, everyone.
...

Samuel Baruffi

Senior Global Solutions Architect @ AWS



