Conf42 Golang 2024 - Online

Developing custom Load Balancer using Envoy

Abstract

I will be talking about developing a custom Load Balancer using Envoy and Go:

  • Concepts of Load Balancing
  • Components in Envoy and their responsibilities
  • Configuring load balancing using gRPC
  • WASM support in Envoy, its uses, and implementation examples
  • Conclude with a working load balancer

Summary

  • Sandeep Bhat will be talking about developing a custom load balancer using Go and Envoy that is both scalable and fault tolerant. He currently works as a staff software engineer at Harness, a company that operates in the DevOps space, on a team focused on cloud cost optimization.
  • Load balancing is a key concept in distributed computing. It is primarily about routing incoming traffic across multiple application servers. Key features of load balancing are high reliability and availability and the flexibility to scale.
  • Envoy is a reverse proxy that operates at layer 7. It has a filter-based mechanism wherein you can chain multiple filters. The primary goal is to distribute traffic among multiple backend targets. The load balancer should be cloud agnostic, in the sense that it should be easy to run across multiple different clouds.
  • We are able to bring up our custom load balancer using cloud-init. We can distribute traffic among multiple backend targets by orchestrating Envoy dynamically, without any downtime. And we can stay cloud agnostic by using cloud-init, with flexibility in terms of scalability and customization.
  • Thus, I would like to conclude my demo. Hope you had as much fun as I had while doing this. Thanks for listening.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello everyone, I am Sandeep Bhat. Today I will be talking about developing a custom load balancer using Go and Envoy that is both scalable as well as fault tolerant. Before we proceed further, I would like to quickly introduce myself. I have eight-plus years of experience in the industry. Prior to my current role, I worked at companies like Walmart, CSquare, and Hewlett Packard Enterprise. Currently I work as a staff software engineer at Harness. Harness is a company that operates in the DevOps space, and I'm part of a team that is focused on cloud cost optimization. As a result of this, I have gotten exposure to multiple cloud products like AWS and GCP. Beyond work, I love traveling across the world as well as reading up on and exploring new technologies.

How are we going to go about this talk? We will start by discussing some of the basic concepts of load balancing. We will then see what the different cloud-native options are. After that we will discuss Envoy: what it is, some of its key features, and the components of Envoy that we will be utilizing in our custom load balancer. We will then see the features we want to target with our custom load balancer and discuss the architecture of its components. Towards the end of the talk, we'll also see a working demo of the custom load balancer.

So what do we mean by load balancing? Load balancing is a key concept in distributed computing, where scalability, reliability, and fault tolerance are essential. At its core, load balancing is primarily about routing incoming traffic across multiple application servers, ensuring that they are not overwhelmed with requests and that resources are used optimally. As you can see in this image, you have multiple users trying to access a particular service over the Internet, and a load balancer sitting between the users and the application servers, routing traffic across the applications.

Some of the key features of load balancing are high reliability and availability and the flexibility to scale: the application servers can scale by adding more servers, achieving what is called horizontal scaling, and application performance improves when traffic is distributed evenly across them. Some commonly seen load balancing algorithms are round-robin load balancing, weighted round-robin load balancing, and least-connection-based load balancing.
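To make the round-robin idea concrete, here is a minimal sketch of a round-robin reverse proxy in Go. This is not code from the talk; the backend addresses and the listening port are illustrative:

    package main

    import (
        "log"
        "net/http"
        "net/http/httputil"
        "net/url"
        "sync/atomic"
    )

    // roundRobin cycles through a fixed set of backends.
    type roundRobin struct {
        backends []*url.URL
        counter  atomic.Uint64
    }

    // next returns the backend for the current request, wrapping around.
    func (rr *roundRobin) next() *url.URL {
        n := rr.counter.Add(1)
        return rr.backends[int((n-1)%uint64(len(rr.backends)))]
    }

    func main() {
        // Hypothetical application-server addresses.
        b1, _ := url.Parse("http://10.0.0.1:80")
        b2, _ := url.Parse("http://10.0.0.2:80")
        rr := &roundRobin{backends: []*url.URL{b1, b2}}

        proxy := &httputil.ReverseProxy{
            Rewrite: func(pr *httputil.ProxyRequest) {
                pr.SetURL(rr.next()) // pick the next backend per request
            },
        }
        log.Fatal(http.ListenAndServe(":8080", proxy))
    }

Weighted round robin and least-connection differ only in how next() picks a backend: by weight, or by the lowest count of in-flight requests.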
This is a simple example of load balancing. On the left side of the screen you have multiple clients trying to access a particular service over the Internet; the load balancer sees that both application servers are healthy and distributes traffic across both of them evenly. On the right, one of the application servers has gone down. The load balancer recognizes that and routes all the traffic to the remaining application server, so the clients do not see any difference; they do not notice anything.

Talking about the cloud-native options, what do we have? AWS has its own offering called ALB, Azure has Application Gateway, and GCP has its own offering, Cloud Load Balancing. In this case we are primarily discussing layer 7 load balancing; we are not going to discuss layer 4 load balancing here. The core idea of load balancing involves three major components: incoming traffic, which is processed using rules that define how it has to be acted upon, and target groups, the logically grouped application servers that serve all of that incoming traffic.

Moving on, what do we mean by Envoy? Envoy is a CNCF-graduated project; it evolved out of Lyft and is written in C++. Envoy is a reverse proxy that operates at layer 7, and it's pretty extensible in the sense that it has a filter-based mechanism wherein you can chain multiple filters, much like middleware chains in API servers; by chaining filters you can customize Envoy to your own needs. And as you can see from the commits and stars, it's a pretty popular project.

So what are some of the key features of Envoy? Service discovery; load balancing; health checking, wherein you are able to take certain VMs out of rotation when they are unhealthy; security, where Envoy offers a lot of features; observability, wherein you are able to track metrics around the number of requests served and other aspects; and rate limiting, where you are able to ensure that your backend servers are not overwhelmed by limiting the number of requests within a given threshold. And it's pretty extensible: as we said before, you can use multiple filters to customize Envoy.

So what are some of the key components of Envoy that we will be using in our custom load balancer? Listeners, filters, clusters, secrets, and upstreams. Looking at this image, you see port 443 and port 80; these are examples of listeners, wherein our load balancer listens on port 80 as well as port 443 for incoming requests. Requests then move through filter chains. These filter chains include things like domain matching as well as custom components; for example, there could be Lua plugins that you use to track incoming requests, log them, or act on the packet. Once these rules are applied to an incoming request, it is routed to the appropriate cluster. A cluster is nothing but a group of upstreams, where an upstream is a single VM, for example a particular virtual machine that serves your application; a logical grouping of these upstreams for a given domain, or a given path in a domain, is a cluster. And secrets are used for managing and handling the certificates.

This is a sample working configuration of Envoy. In this case we will be serving traffic coming in on port 80, and we are not restricting ourselves to any domain, which means that any traffic coming in on port 80 at the root path is routed to the cluster named some_service.
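The slide itself is not part of the transcript, so the following is a minimal reconstruction in Envoy's v3 YAML format of the configuration being described; the endpoint IP address and the health-check thresholds are assumptions:

    static_resources:
      listeners:
      - name: listener_http
        address:
          socket_address: { address: 0.0.0.0, port_value: 80 }
        filter_chains:
        - filters:
          - name: envoy.filters.network.http_connection_manager
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
              stat_prefix: ingress_http
              route_config:
                virtual_hosts:
                - name: all_domains
                  domains: ["*"]            # no domain restriction
                  routes:
                  - match: { prefix: "/" }  # root path
                    route: { cluster: some_service }
              http_filters:
              - name: envoy.filters.http.router
                typed_config:
                  "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
      clusters:
      - name: some_service
        connect_timeout: 1s
        type: STATIC
        load_assignment:
          cluster_name: some_service
          endpoints:
          - lb_endpoints:
            - endpoint:
                address:
                  socket_address: { address: 10.0.0.1, port_value: 80 }
        health_checks:
        - timeout: 2s       # values described in the talk
          interval: 5s
          unhealthy_threshold: 3   # assumed
          healthy_threshold: 2     # assumed
          http_health_check: { path: "/" }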
The cluster some_service in this case points to a particular IP address on port 80, with the health check defined as seen here: a timeout of 2 seconds and a check performed at an interval of every 5 seconds. A failed health check means the endpoint is taken out of rotation.

So what are the requirements for our custom load balancer? You could add more features than the ones listed here, but to begin with we will look at some basic features that we want to support. The primary one is to distribute traffic among multiple backend targets. Our custom load balancer should not be limited to any given domain: we should support multiple domains, with one load balancer that can route traffic to multiple domains and paths. It should support health checking, so that we are able to take unhealthy application servers out of rotation. It should be cloud agnostic, in the sense that we should be able to run our custom load balancer, or easily port it, across multiple different clouds. And we are looking at scalability and customization; we will see how we achieve those as we go along.

This is the design of our custom load balancer, and it has a few components. As you can see here, we have a virtual machine which will be the custom load balancer. The idea is that we run our load balancer inside a VM, and this VM behaves as the custom load balancer; by making it cloud agnostic, you can easily run this VM anywhere, be it AWS, GCP, or even Azure. Our custom load balancer runs within the VM, but it interacts with some components which are outside the VM, in this case an API server as well as a database. The database stores the configuration of our load balancer: the domain, the incoming port, the outgoing port, the IP addresses of the application VMs, as well as details related to certificates, etcetera. The API server fetches this data from the database and interacts with our load balancer, which fetches this configuration, parses it, and translates it into something Envoy can understand.

So, inside the VM we have two services that run as Linux systemctl services: Envoy itself, and a custom control plane written in Go that communicates with Envoy over gRPC. As we discussed before, Envoy supports service discovery, and we are going to make use of that. We will also be using cloud-init; we will come to that later.

This is how a sample Envoy service definition looks. The aspect of this service configuration that we are interested in is the startup configuration that we pass to Envoy when it boots up; that way it knows how to interact with our control plane. This is the configuration we bring Envoy up with, and these are the service discovery components of Envoy that we want to enable.
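As a hedged sketch rather than the speaker's exact file, a bootstrap config enabling these discovery services over gRPC against a local control plane could look like this; the node id is an assumption, and the 127.0.0.1:18000 address is taken from the discussion that follows:

    node:
      id: custom-lb        # must match the node id the control plane serves
      cluster: custom-lb

    dynamic_resources:
      lds_config:          # Listener Discovery Service
        resource_api_version: V3
        api_config_source:
          api_type: GRPC
          transport_api_version: V3
          grpc_services:
          - envoy_grpc: { cluster_name: xds_cluster }
      cds_config:          # Cluster Discovery Service
        resource_api_version: V3
        api_config_source:
          api_type: GRPC
          transport_api_version: V3
          grpc_services:
          - envoy_grpc: { cluster_name: xds_cluster }

    static_resources:
      clusters:
      - name: xds_cluster
        connect_timeout: 1s
        type: STATIC
        typed_extension_protocol_options:
          envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
            "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
            explicit_http_config:
              http2_protocol_options: {}   # gRPC requires HTTP/2
        load_assignment:
          cluster_name: xds_cluster
          endpoints:
          - lb_endpoints:
            - endpoint:
                address:
                  socket_address: { address: 127.0.0.1, port_value: 18000 }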
In this case, lds_config refers to the Listener Discovery Service (LDS) and cds_config refers to the Cluster Discovery Service (CDS). Our Envoy boots with these two discovery services enabled, and as you can see here, it uses gRPC to communicate and discover the services. We tell Envoy that it should communicate with the xDS cluster, something we call xds_cluster here, to fetch its configuration and update itself dynamically without any downtime. What we mean is that if we were ever to add a new domain to, or remove a domain from, our load balancing configuration, we would not want to restart Envoy or take any downtime to update the configuration; that happens dynamically. In this case our xDS cluster, our control plane, runs on port 18000 on the same host, 127.0.0.1, as you can see here, though it could run on any port.

Our control plane can look something like this. It can be pretty minimal, in the sense that you have a goroutine running the xDS server, which communicates with Envoy through gRPC and shares the configuration that Envoy has to update itself with. And then we also have a sync server that periodically talks to the API server and fetches the latest configuration. In this case we poll our API server, but we could even use websockets to improve the performance. This is the gRPC communication with Envoy, basically the gRPC control plane updating Envoy. And this is our sync server: a basic one, where we have a for loop running and a select construct that uses a timer to periodically sync and fetch the configuration.
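A structural sketch of such a control plane in Go, built on the envoyproxy/go-control-plane library; API details vary between library versions, and fetchConfig, the node id, and the sync interval are assumptions here:

    package main

    import (
        "context"
        "fmt"
        "log"
        "net"
        "time"

        clusterv3 "github.com/envoyproxy/go-control-plane/envoy/config/cluster/v3"
        clusterservice "github.com/envoyproxy/go-control-plane/envoy/service/cluster/v3"
        listenerservice "github.com/envoyproxy/go-control-plane/envoy/service/listener/v3"
        "github.com/envoyproxy/go-control-plane/pkg/cache/types"
        cachev3 "github.com/envoyproxy/go-control-plane/pkg/cache/v3"
        "github.com/envoyproxy/go-control-plane/pkg/resource/v3"
        serverv3 "github.com/envoyproxy/go-control-plane/pkg/server/v3"
        "google.golang.org/grpc"
        "google.golang.org/protobuf/types/known/durationpb"
    )

    // fetchConfig stands in for the call to the API server; a real version
    // would translate database rows (domain, VMs, ports, health checks)
    // into Envoy listener and cluster resources.
    func fetchConfig() map[resource.Type][]types.Resource {
        c := &clusterv3.Cluster{
            Name:                 "some_service",
            ConnectTimeout:       durationpb.New(2 * time.Second),
            ClusterDiscoveryType: &clusterv3.Cluster_Type{Type: clusterv3.Cluster_STATIC},
            // Endpoints, health checks, and listeners are omitted in this sketch.
        }
        return map[resource.Type][]types.Resource{resource.ClusterType: {c}}
    }

    func main() {
        ctx := context.Background()
        // The snapshot cache holds versioned config that Envoy pulls over xDS.
        snapshots := cachev3.NewSnapshotCache(false, cachev3.IDHash{}, nil)

        // Goroutine one: the gRPC xDS server Envoy connects to on 127.0.0.1:18000.
        go func() {
            srv := serverv3.NewServer(ctx, snapshots, nil)
            gs := grpc.NewServer()
            listenerservice.RegisterListenerDiscoveryServiceServer(gs, srv)
            clusterservice.RegisterClusterDiscoveryServiceServer(gs, srv)
            lis, err := net.Listen("tcp", "127.0.0.1:18000")
            if err != nil {
                log.Fatal(err)
            }
            log.Fatal(gs.Serve(lis))
        }()

        // Goroutine two: the sync loop, a for/select on a timer that polls the
        // API server and pushes a new snapshot; Envoy picks it up without downtime.
        version := 0
        ticker := time.NewTicker(30 * time.Second)
        for {
            select {
            case <-ticker.C:
                version++
                snap, err := cachev3.NewSnapshot(fmt.Sprint(version), fetchConfig())
                if err != nil {
                    log.Println(err)
                    continue
                }
                // "custom-lb" must match the node id in Envoy's bootstrap config.
                if err := snapshots.SetSnapshot(ctx, "custom-lb", snap); err != nil {
                    log.Println(err)
                }
            }
        }
    }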
Now, talking about configuration, this is the basic entry in our database. Our DB would have a JSONB field, if you were to use Postgres, with fields like these. As you can see, the domain we want to support here is sandeepbhat.com, and these are the different VMs that are going to host this particular domain. The incoming port and outgoing port are both port 80: the requests come in on port 80 and the VMs serve on port 80 as well. And we have the health check definitions as well. You can have multiple rows like this for different domains, and they are parsed by our load balancer to serve that traffic.

In terms of packaging, cloud-init is a key component. It's an initialization and package installation system, it supports writing custom scripts, and it's pretty cloud independent: you can use cloud-init on AWS, GCP, and Azure, and a particular cloud-init script will be supported as-is across multiple different products. What this helps us do is bring up our custom load balancer using cloud-init. You would have a cloud-init script which runs whenever the system boots up, fetches Envoy and installs it as a service on the system, and also downloads the binary of the control plane that we designed and gets it up and running as a Linux service. That way you are able to package them together. A sample cloud-init script can look something like this. The key thing to notice here is the scripts-user module: we always want our cloud-init script to run whenever the system boots, and with that we can even support updating our system whenever we have a newer version of our control plane. The script runs towards the end of the boot sequence.
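A hedged sketch of such a script: the talk highlights the scripts-user module, and one documented way to get the same run-on-every-boot behavior is cloud-init's per-boot scripts directory, used below. The artifact URLs and service names are hypothetical:

    #cloud-config
    write_files:
    # Scripts under per-boot run on every boot, matching the talk's intent of
    # configuring the scripts-user stage to run "always".
    - path: /var/lib/cloud/scripts/per-boot/setup-lb.sh
      permissions: "0755"
      content: |
        #!/bin/bash
        set -euo pipefail
        # Install Envoy once, if it is not already present.
        if ! command -v envoy >/dev/null 2>&1; then
          curl -fsSL https://example.com/artifacts/envoy -o /usr/local/bin/envoy
          chmod +x /usr/local/bin/envoy
        fi
        # Always refresh the control-plane binary so a reboot picks up new versions.
        curl -fsSL https://example.com/artifacts/control-plane -o /usr/local/bin/control-plane
        chmod +x /usr/local/bin/control-plane
        systemctl restart envoy control-plane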
With this, we now have a custom load balancer that addresses all the aspects we set out. We are able to distribute traffic among multiple backend targets by orchestrating Envoy dynamically, without any downtime. We support multiple different domains and health checking. And we are able to stay cloud agnostic by using cloud-init, with flexibility in terms of scalability and customization. As we saw in our design, we can run our custom load balancer on any virtual machine, which means that if you had to run it on, say, AWS, you could run the load balancer on a machine with a lower spec, like a t2.small or t2.medium, or you could run it on a bigger machine, like a t2.xlarge or larger, based on your traffic needs.

I would also like to do a bit of a cost comparison between AWS ALB and our custom load balancer, as to why we would want to use the custom load balancer and how it's beneficial. AWS ALB has hourly pricing, and it also has cost components based on different dimensions like new connections, active connections, bytes processed, and rules processed. If you add that up, it can come to something close to $16 per month, and I'm not even taking into consideration the cost of traffic or data processed; I'm only looking at the basic cost components here. A custom load balancer, on the other hand, has hourly pricing per instance: the cost of running a VM is all you pay for. If you run a smaller VM, say a t2.medium or even smaller, you end up paying a lower amount; running our load balancer on a t2.micro, for example, would be close to around $8 per month, and we have seen a t2.micro to be pretty capable of handling traffic at a decently good scale.

Now let's take a look at our demo. As you can see here, I have three terminals. The one here is running Envoy, this one is running my API server, and this one out here is running the control plane. Ideally, our load balancer deployed in any virtual machine would have these two components in it, the one at the top as well as the one at the bottom, while the API server runs outside the system, and our control plane is configured to talk to the API server. I have also modified the /etc/hosts file of my system to point the domain sandeepbhat.com to my own system, localhost, to aid with this demo.

Going back to the demo, I have two VMs running in AWS that are running nothing but a basic Nginx server with a custom HTML file, which simply prints the IP address of the VM. As you can see, currently both of the VMs are running, and I'm pointing to my domain.

So what happens is, when we hit sandeepbhat.com in the browser, as per our /etc/hosts file it points to our own system, 127.0.0.1 in this case, on port 80, which is how our load balancer is configured now. As you saw in a previous slide, the domain there was different, but we have configured this one pretty similarly, wherein any traffic coming in on port 80 is routed to port 80 on the target. So when we hit sandeepbhat.com here, the request goes to port 80 on the same system, where our load balancer is listening, and it routes the traffic to the target VMs, demo VM one and demo VM two, and we see the IP address of the demo VMs being printed. As you can see, it's routing traffic evenly across both of them. Now, if we bring down one of the VMs, we should pretty soon see our load balancer routing all the traffic to the other VM. Right, the health check has come into effect: the load balancer is able to identify that one of the target VMs is unhealthy, and we keep seeing all the requests go to the same remaining VM.

Thus, I would like to conclude my demo. Hope you had as much fun as I had while doing this talk. Thanks for listening.
...

Sandeep Bhat

Staff Software Engineer @ Harness



