Conf42 Cloud Native 2024 - Online

Architecting Resilient Microservices: A Deep Dive into Building a Service Mesh with Envoy

Abstract

Join my talk and learn the ins and outs of building fault-tolerant, scalable microservices architectures, ensuring your applications stand resilient in the face of challenges. Elevate your understanding of service mesh dynamics and empower your development journey with the insights shared.

Summary

  • A service mesh is a popular solution for managing communication between individual microservices. Envoy is a high-performance proxy server that can take care of these capabilities through configuration. With these features and capabilities, a service mesh improves the application's resilience and makes it highly performant.
  • A European government initiative aims to build a common framework and technical practices for the design of reusable and interoperable digital components. The proposed solution is built around the Envoy proxy.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Welcome, everyone, to my talk on architecting resilient microservices: a deep dive into building a service mesh with Envoy. Before we dive into the topic, let's first go over the agenda. In today's talk I'm going to cover some of the key challenges with the microservices architecture, what a service mesh is as a pattern and its core features, and the Envoy proxy. We'll then dive into a use case and a problem and try to devise a solution for it. We'll also see the benefits of using the service mesh pattern with a proxy. So let's dive in.
Now let's first understand what a service mesh is. A service mesh is a popular solution for managing communication between individual microservices. In a complex microservices application or architecture, it provides a dedicated infrastructure for handling service-to-service communication. Let's say you have a number of microservices deployed as service A, B and C. With the service mesh pattern, the individual services do not have to take care of which service to call, how to call it, and what its endpoints are. All such configuration is managed and handled by the proxy. So along with every microservice that you deploy, we will also have a proxy, configured through nothing more than a simple YAML file, which takes care of routing and filtering configuration in a very dynamic manner. That is how the service mesh pattern has been thought about.
Now let's see what capabilities the service mesh provides. It provides a lot of features and abilities. Without a service mesh, any architecture would need to implement all these capabilities from scratch, and that takes a lot of effort as well as time.
So let's dive into the capabilities the service mesh provides. The first and foremost is distributed tracing. A complex architecture typically doesn't stop at a single microservice calling another microservice. Rather, it is a chain of microservices that we typically call, along with a number of internal as well as external integrations that need to be handled in a single workflow. These systems and components may be deployed across various discrete systems, and that's where distributed tracing becomes an important aspect. When there is an error, it is very difficult to locate it in a distributed system, and that's where this capability of the service mesh is very helpful.
Another capability is traffic management. The service mesh enables sophisticated traffic management with the help of load balancing, routing and retries. It enhances security by providing encryption, authentication and authorization mechanisms, and it helps you protect data in flight as well as at rest. It also provides the ability to discover services by maintaining a service catalog, along with the configuration parameters to be sent while calling each service and the service endpoints. All in all, it provides a very resilient architecture through features like circuit breaking and fault injection. With these features and capabilities, a service mesh improves the application's resilience and makes it highly performant.
Another important aspect is policy enforcement. The service mesh allows for the enforcement of policies so that it can handle rate limiting, access control and traffic shaping for you.
Now that we have seen the capabilities and features of a service mesh, let's try to understand what Envoy is.
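As a small illustration of the circuit breaking mentioned above, the following Python sketch shows the general pattern: after a number of consecutive failures, calls to an upstream are rejected for a cool-down period instead of being attempted. In a service mesh this behavior is configured in the proxy rather than written in application code, and this sketch is not how Envoy implements it. The `CircuitBreaker` class, its thresholds and the injectable clock are assumptions made for the example.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Generic failure-count circuit breaker (illustrative, hypothetical)."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds
        self.clock = clock                # injectable so tests can control time
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Still cooling down: fail fast without touching the upstream.
                raise CircuitOpenError("upstream temporarily blocked")
            # Cool-down elapsed: allow a trial call (half-open state).
            # A single failure now re-opens the circuit immediately.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit again
        return result


if __name__ == "__main__":
    breaker = CircuitBreaker(max_failures=2, reset_after=5.0)

    def flaky() -> str:
        raise ConnectionError("upstream unavailable")

    for attempt in range(3):
        try:
            breaker.call(flaky)
        except CircuitOpenError as exc:
            print(f"attempt {attempt}: rejected ({exc})")
        except ConnectionError as exc:
            print(f"attempt {attempt}: failed ({exc})")
```

The point of the pattern is that the third attempt never reaches the failing upstream, which gives it time to recover and keeps callers from piling up on a dead dependency.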
So Envoy is a high-performance proxy server that can be used to take care of these capabilities through configuration. Traditional proxies struggle to keep pace with the dynamic nature of modern service communication, because for any change in the configuration you have to restart the overall system, which eventually causes downtime. This problem is avoided when we use Envoy as the proxy. That's the reason Envoy emerges as a groundbreaking solution to the challenges of service-to-service communication in modern architectures. It was originally developed by Lyft and is now an integral part of the CNCF. Envoy offers a feature-rich, high-performance proxy layer for both layer 4 and layer 7, and its key features include dynamic service discovery, TLS termination, load balancing, support for building resilient systems, and extensibility through a rich set of APIs.
Now let's understand how it works. In order to understand its capabilities and functioning, there are four major configurations one should understand very carefully. Envoy manages the inbound as well as the outbound traffic flow using configurations, the first of which is listeners. A listener defines a dedicated port at a dedicated IP address, and its job is to keep listening for incoming requests or traffic over protocols such as HTTP, HTTPS and TCP.
Next come the routes. Routes define how Envoy routes incoming requests to different upstream services based on criteria such as request path, headers and other metadata. The routes are configured in a routing table and determine which cluster to send the traffic to. So now the request comes to the cluster. A cluster is a group of upstream services that Envoy can forward traffic to.
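To make the relationship between listeners, routes and clusters concrete, here is a minimal Python sketch of the lookup described above. It is not Envoy's configuration format or API. The `Route` and `Listener` classes, the cluster names, the `x-api-version` header and the addresses are all hypothetical, and the sketch assumes that routes are checked in order and the first match wins.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Route:
    """One routing-table entry: match criteria plus a target cluster."""
    path_prefix: str
    cluster: str
    headers: Dict[str, str] = field(default_factory=dict)

    def matches(self, path: str, headers: Dict[str, str]) -> bool:
        # A route matches when the path prefix and every required header agree.
        if not path.startswith(self.path_prefix):
            return False
        return all(headers.get(k) == v for k, v in self.headers.items())


@dataclass
class Listener:
    """An IP address and port where the proxy accepts traffic (toy model)."""
    address: str
    port: int
    routes: List[Route]

    def pick_cluster(self, path: str,
                     headers: Optional[Dict[str, str]] = None) -> Optional[str]:
        # Assumption: routes are checked in order and the first match wins.
        for route in self.routes:
            if route.matches(path, headers or {}):
                return route.cluster
        return None  # no route matched


# Hypothetical clusters: each name maps to a group of upstream hosts.
CLUSTERS = {
    "birth-registry": ["10.0.1.10:8080", "10.0.1.11:8080"],
    "health-registry": ["10.0.2.10:8080"],
    "health-registry-v2": ["10.0.3.10:8080"],
}

listener = Listener(
    address="0.0.0.0",
    port=8443,
    routes=[
        # More specific routes come first because the first match wins.
        Route("/health", "health-registry-v2", {"x-api-version": "2"}),
        Route("/health", "health-registry"),
        Route("/birth", "birth-registry"),
    ],
)

if __name__ == "__main__":
    cluster = listener.pick_cluster("/health/patients", {"x-api-version": "2"})
    print(cluster, CLUSTERS[cluster])
```

Keeping the match criteria in data rather than in the calling service is what lets the routing table change without touching the services themselves.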
Each cluster defines a set of endpoints that belong to it. Envoy uses a load balancing algorithm to distribute traffic across the endpoints within a cluster. Endpoints are the individual instances of a service that belong to a cluster. They represent the destination for outbound traffic from Envoy. Envoy dynamically updates its list of endpoints as and when there is an update, without the need to restart your servers.
So in short, for the inbound flow, Envoy first receives the incoming traffic and intercepts it through the listener. It then processes the traffic according to the defined routes, performs filtering based on the filter chain that has been configured, and directs the request to the appropriate cluster based on the routing rule. Finally, Envoy forwards the request to the corresponding endpoint within the cluster. In the case of an outbound flow, Envoy receives the traffic from the upstream services based on the configured cluster and the corresponding endpoint. It uses a load balancing algorithm to distribute the traffic across the available endpoints within a cluster, and then forwards the traffic to the selected endpoint based on the load balancing decision.
Now that we have seen the capabilities, features and functioning of both the service mesh and the Envoy proxy, let's dive into the use case and problem statement that we will discuss today. This use case is a European government initiative to accelerate the digital transformation and digitization of government services in order to achieve their sustainable development goals. The initiative aims to build a common framework and technical practices for the design of reusable and interoperable digital components.
The aim of building this solution is to come up with best practices and a common reusable framework that can be leveraged across various government stacks in the future, so that nobody has to reinvent or rebuild these common components from scratch. It is also intended to reduce the cost, time and resources that would otherwise be required to create this digital platform.
Now, the key technical ask of this use case is a gateway for the secure exchange of data and services among the various building blocks through open API-based communication. The interfaces need to ensure interoperability, and the implementation needs to follow the standards and incorporate best practices. The next requirement is to enforce best practices so as to bring standardization to data sharing policies and data exchange.
Now let's look at how communication between the various building blocks of the government stack looks. The government has multiple business units. Some take care of citizen ID cards, some record childbirths, others take care of patient records and registries or death records and registries, and there are various other building blocks as well. All in all, even just by name, you can see that these business units are interconnected, and at any point in time they need to share a lot of information between them to maintain the consistency of the data and the records. That's the requirement of this use case.
Now, when it comes to the solution, what we propose is implementing it using a service mesh with the Envoy proxy. So the proposed solution is built around the Envoy proxy.
In this solution we focus on how to ensure secure communication between two building blocks of the government stack, as well as how to ensure that the data exchange is seamless. The reason we recommend Envoy here is its dynamic configuration capability, its high performance and the extensive ecosystem of plugins available for Envoy, which can easily be tailored to the need. And most importantly, it is open source.
The high-level architecture presented here illustrates a couple of key components. The first one is the Envoy proxy. It sits in the data path, intercepting any request between the services and the resources that need access control and authentication, and then applying the access control and authentication rules. It dynamically evaluates the access information against a set of policy rules and provides a go or no-go for the intercepted request. Multiple Envoy proxies may be deployed within a security server for internal as well as external traffic. This Envoy proxy sits in front of each government building block, the ones we mentioned such as the health registry, the childbirth information and the patient records. So in front of each of these building blocks there will be one Envoy proxy, which we call a security server, and it takes care of authentication, authorization, secure data exchange and policy enforcement.
Then, in order to manage and control the overall ecosystem whenever the configuration is updated, we will also implement the Envoy control plane. The implementation needs to work in such a way that the services receive policy change notifications from the policy admin service, and each notification gets propagated.
The notification is first received by the control plane, which then generates the access rules for the Envoy proxy dynamically and pushes this update to the relevant Envoy proxies, ensuring that the policies are enforced in a timely manner at runtime, without delay and without any restart. This service uses the xDS protocol for communication.
The authentication service that you see in the block is responsible for the actual authentication of upstream and downstream services. Like the control plane service, it receives notifications and updates from the policy admin service. The service also sits in the data path, since authentication is handled as part of the request flow from the downstream applications and proxies. So this is a solution built using the service mesh pattern with Envoy.
Now let's see the different ways you can deploy the Envoy proxy. This is a deployment architecture in which you will see the Envoy proxy taking care of internal communication as well as external integrations. There are various ways this proxy can be deployed. First, it can act as an Envoy proxy; second, it can sit as a sidecar; and third, it can take care of external-facing communication. Let's see how.
In the first case, the Envoy proxy is typically used for inter-proxy communication and is deployed to control access to and from classic applications running on VMs (virtual machines). This involves setting up routing so that all traffic has to pass through the proxy for secure communication and authentication. An alternative is the sidecar approach that you see here, in which all the applications deployed in Kubernetes have an Envoy proxy deployed as a sidecar.
The third approach is the external-facing Envoy proxy, which intercepts requests going to and coming from the external security server and ensures that they are authenticated.
Now I want to highlight a couple of points on why we proposed this solution and what its benefits are. First of all, there are multiple challenges that we have seen when building microservices architectures without a service mesh and without a proxy. When we propose a solution using a service mesh with a proxy, it automatically handles a lot of boilerplate code that would otherwise take a lot of time and effort to implement from scratch.
The first advantage is that it makes your system very performant, because the access rules are encoded into the proxy configuration, which is stored locally and enforced at the proxy acting as a gateway, so there is no additional delay. You also save on network calls. Second, whenever you push a dynamic configuration there is no need to restart your system, which ensures high availability. Third, the solution works well both for cloud native applications and for applications hosted on VMs. The solution is also very future-proof: the proxy was built with support for modern protocols such as HTTP/2 and gRPC, it provides a wide set of APIs, and it offers a number of plugins that can easily be tailored to your needs. And it has built-in support for observability through logging, metrics and tracing.
That's the reason we chose this solution for the problem we have just discussed. With this we are at the end of our session. Thanks for watching. Feel free to ping me with any questions and queries. Thank you so much.

Manik Kashikar

Google Cloud Practice Lead @ Thoughtworks
