Conf42 Kube Native 2023 - Online

Closing the Developer Experience Gap of Kubernetes Development


Abstract

K8s development teams should have a developer experience that allows them to focus on the things that matter (e.g., coding, testing, iterating) instead of things that don’t (e.g., waiting for the build/push/test cycle to be completed). This talk explains how to close the developer experience gap of K8s.

Summary

  • Edidiong Asikpo talks about how to close the developer experience gap of Kubernetes. She says the adoption of containerization has solved many challenges that businesses face today. This has motivated several people to adopt cloud native technologies.
  • Developer experience is the workflow a developer uses to develop, test, deploy and release software. The adoption of cloud native technologies has altered this developer experience in two ways. A slow inner dev loop impacts everyone.
  • Telepresence is a tool that enables teams to test and debug on Kubernetes. It does this by connecting your local machine to a cluster via a two-way proxy mechanism. There are two ways to intercept traffic with Telepresence: one is called global intercept, the other is called personal intercept.
  • Telepresence gives you that all-round developer experience as it bridges the gap between local development and clusters. If you'd love to try Telepresence, you can do so by visiting this link. We currently have a 30-day free trial.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello everyone, welcome to my talk on how to close the developer experience gap of Kubernetes. My name is Edidiong Asikpo and I currently work as a senior developer advocate at Ambassador Labs. I'm also a CNCF ambassador and technical content creator. You'd oftentimes find me building, writing, and sharing knowledge with people in the developer community. I go by Didicodes on all my social media platforms except LinkedIn, where I use my first and last name, Edidiong Asikpo. So if you think about it in a general sense, the adoption of Kubernetes and containerization has solved many challenges that businesses face today when it comes to flexibility, scaling, and the reliability of releasing new versions. This has motivated several people to adopt cloud native technologies because they also wanted to enjoy these benefits. And that's why you've seen so many companies transition from monoliths to microservices, and still more are in the conversation of making that transition as well. So even though Kubernetes enables you to achieve all these amazing things, what nobody really tells you is that it significantly impacts the developer experience we once knew. I'm sure you probably know the meaning of developer experience, but let me just quickly remind you what it means and also lead up to the next point in this conversation. Developer experience is the workflow a developer uses to develop, test, deploy and release software. So it's pretty much what happens from the time they start writing code to when they push their code to production. This developer experience consists of two parts, the inner development loop and the outer development loop. The inner development loop is where the developer does the build-push-test cycle: they write the code, build it to confirm that the code works as expected, and then of course test it to finalize that confirmation.
And once they feel satisfied with this process, they push the code to version control, like GitHub for instance. The moment this push is made, it automatically triggers the outer dev loop. So the outer dev loop is everything that happens leading up to when the code is pushed to production. This could be code merge, canary release, deployment, and all of that other interesting stuff. So the adoption of cloud native technologies has altered this developer experience in two ways. One is that developers now have to perform extra steps in the inner development loop, and secondly, developers now have to be more involved in the outer dev loop, even though most of their time is spent on the inner dev loop, where they are actually writing the code and testing the impact of their code changes. They now have to be concerned about canary releases, deployment, code merges, and all of that other stuff. And even though this comes with certain benefits, it also has its disadvantages. For this talk, we're going to focus on the inner dev loop, because that's actually where the debugging and developer experience happens. Once you can make the inner dev loop as fast as possible, it indirectly affects every other part of the developer experience, because it lets you ship products to your end users a lot faster. So here's what a traditional inner dev loop looks like. Let's assume that the developer in this case spends 6 hours per day writing code, and the inner dev loop takes about five minutes, for instance. This means that they'll spend three minutes coding, one minute building and reloading, the next minute inspecting that the code changes they have made are working as expected, and, let's say, ten to 20 seconds committing that code change to a version control system.
And if you want to count this or break it down, you realize that with this formula, that developer would be able to make at least 70 iterations of their code per day. And the only developer tax that they pay here is the commit time, which is actually negligible because it just takes ten to 20 seconds or less, depending on how detailed you want that commit message to be. But then here's what the inner dev loop of a containerized system looks like when you start adopting Kubernetes and other cloud native technologies. Yes, coding still remains the same, but after you've made that code change, you now have to wait for your code to be containerized, pushed to a registry, and deployed to a Kubernetes cluster before you can see the impact of your code change. And then you realize that this automatically reduces the number of iterations from 70 to 40. The developer tax being paid here is in this build-push-test cycle, which, as you can see, is longer than the traditional inner dev loop, where you just spend ten to 20 seconds committing. And oftentimes people just neglect this and say, oh, fine, it's great, let me go grab a cup of coffee, let me quickly watch a Netflix episode while my code is containerizing. But then you realize that in the long run, all of those minutes that you spend waiting for the code to be containerized, pushed to a registry, or deployed into a Kubernetes cluster could have been used to do the most important things that developers are expected to do, which is writing code, seeing the impact of their code changes, and pushing to production as soon as possible. So a better inner dev loop means that you'll be able to move faster, and I will continue to explain that as we go further in this talk. So a slow inner dev loop impacts everyone.
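The iteration figures quoted above can be checked with a little arithmetic. The sketch below uses the six hours of coding per day and the five-minute traditional loop from the talk. The nine-minute containerized loop is an assumption, back-derived from the "40 iterations" figure, since the talk does not give the exact breakdown. Commit time is left out because the talk treats it as negligible.

```python
# Back-of-the-envelope iteration count for the inner dev loop.
CODING_MINUTES_PER_DAY = 6 * 60  # six hours of coding per day, as in the talk


def iterations_per_day(loop_minutes: float) -> int:
    """Return how many full inner-dev-loop iterations fit into one coding day."""
    return int(CODING_MINUTES_PER_DAY // loop_minutes)


# Traditional loop from the talk: 3 min coding + 1 min build/reload + 1 min inspect.
TRADITIONAL_LOOP_MINUTES = 3 + 1 + 1

# Assumption: roughly 9 minutes per containerized loop, chosen so the result
# matches the "40 iterations" quoted in the talk. Not a measured value.
CONTAINERIZED_LOOP_MINUTES = 9

if __name__ == "__main__":
    print(iterations_per_day(TRADITIONAL_LOOP_MINUTES))    # 72
    print(iterations_per_day(CONTAINERIZED_LOOP_MINUTES))  # 40
```

Under these assumptions, every extra minute of waiting in the loop costs several iterations per day, which is the "developer tax" the talk refers to.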
For instance, front end developers now have to wait for previews of backend changes on a shared dev environment or rely on mock databases, mock API scripts, those kinds of things, when coding the application locally. Backend developers, on the other hand, now have to wait for CI/CD to build and deploy their apps to a target environment to verify that their code works correctly. And this doesn't just affect front end and back end developers, it affects everybody at large, because it slows down the releases into production. That impacts the business, because you're not moving as fast as possible, and the end users, because, let's say, they are stuck with a bug that cannot be fixed immediately because of the slow inner dev loop that the developers are currently experiencing in the company. So this bad developer experience doesn't just affect the developers, but also the user experience and the company. So is there a way out? Is there a way to actually enjoy all of the benefits that Kubernetes has to offer without being slowed down or impacting your developer experience? The answer is yes, thankfully. And there are a couple of ways to do this. The first one is where you get to run everything locally. And the good thing about this development environment is that you still get to enjoy all the benefits of local development. You can set breakpoints, enable hot reloading, and even see logs a lot faster. And because everything is running locally, it means that you'd have a faster inner dev loop, because as soon as you make a code change, you can quickly test it against its dependencies. Another great thing about this is that it's also relatively cheap, because you don't have to spend money on Kubernetes clusters. I mean, we all know how expensive that can be. But then it requires really high maintenance. You'd always have to confirm that the mock API scripts are up to date.
That applies whenever you want to make a code change, and think of a situation where there are several developers in your company all making their own changes as they work. It can be really tough to ensure that this mock API script is up to date. And that's why you oftentimes see companies who use this method push things to production and realize that there is a mistake that they missed because their mock API script wasn't as up to date as it should have been. And then because everything is running locally, it also makes your workstation really hot. At some point you'd have to move away from this because there's only so much that your laptop can actually handle. So even though this method has several benefits (a fast feedback loop, it's cheap, it gives you access to local development tools), the high maintenance and hot workstation aren't sustainable. So the other option here is to try remote development, where everything runs remotely. And because everything runs remotely, you now have a normal workstation. There's no heat; your laptop is not becoming too hot, and you can use it as expected. And the maintenance is also really low because you can set up CI/CD systems to ensure that every single time someone makes a code change, it is updated all around. And whenever another developer wants to test the impact of their code changes, they are using the most up-to-date API script or database or whatever dependencies they are testing against. But then the cost is really high. Every developer would have to use their own remote development cluster, which can be quite expensive. And then the inner dev loop, or the feedback loop, in this case is extremely slow.
Because every single time a developer makes a code change locally and wants to test the impact of that code change, they need to containerize it, push it to a registry like Docker Hub for instance, and then deploy it into the remote Kubernetes cluster. So doing this all over again every single time you want to make a code change slows you down and makes you less productive. So instead of doing 70 iterations of your code per day, you now end up doing 40 iterations per day. So even though this development environment has great benefits, like a normal workstation temperature and low maintenance, the cost is very high, it has a slow feedback loop, and you lose out on the many benefits you enjoyed from local development, like debugging, breakpoints, hot reloading, and all of that other interesting stuff. So you'd agree with me that these two different development methods have their own benefits. The local development environment has several pros and the remote development environment also has its own pros. So how can we combine these two things and create a development environment where you get to enjoy the benefits of local development and the benefits of remote development? This is where Telepresence comes in and enables you to achieve this. It creates a development environment we call remocal, which is remote and local merged, and gives you the best of both worlds. So now your cost would be low, because even though things are running in the remote Kubernetes cluster, development teams can now use shared clusters. So it means you cut down the amount of money you had to spend paying for clusters for each developer. And the maintenance here is very low because you can still use your CI/CD systems to automate and update your API scripts, databases, and all other dependencies whenever a code change is made.
And then the temperature is normal, because the only thing you have to run locally is the service you're actually making changes to, while everything else runs remotely. And then this gives you a fast feedback loop because you no longer have to do the build-push-test cycle. All you need to do is run telepresence intercept and you'll instantly be able to reroute the traffic going to the service in the cluster to the service on your local machine. That way you can test how this service would work with its dependencies in the remote Kubernetes cluster. So what exactly is Telepresence, you might ask? Telepresence is a CNCF tool that enables teams to test and debug on Kubernetes a lot faster and in a seamless process. It does this by connecting your local machine to a cluster via a two-way proxy mechanism, which enables you to access cluster resources as if they were running locally, and to reroute the cluster's traffic to your local service. There are two ways to intercept traffic with Telepresence. The first one is called global intercept, while the other is called personal intercept. What a global intercept does is reroute the traffic that was intended for a service in the remote cluster to a service running on your local machine. All of the traffic. Personal intercepts, on the other hand, only intercept a subset of the traffic. And this is vital because there are certain times when different developers are working on the same Kubernetes cluster, and you don't want your debugging or testing to affect the work that the other developer is doing. So in this case, you'd only send a subset of the traffic to just your laptop, while every other request coming to that service in the cluster would go there as intended. So here is an architecture diagram of how Telepresence works.
So let's say you have a service called service A prime running locally on your computer and another service called service A running in the cluster. Whenever a request comes in through the ingress, it's going to hit the sidecar agent, which has been added by Telepresence. And once it hits this sidecar agent, the agent is going to direct all the traffic coming here to the traffic manager, which would then reroute it to your laptop. And of course that's in the case of a global intercept. But if this was a personal intercept, once the request comes in through the ingress and hits the sidecar agent, the agent is going to check and say, hey, does this request have the HTTP header that was set when I ran the telepresence intercept command? If it does, it sends it to the traffic manager, which then sends it to your laptop. But if it doesn't, it goes to service A as expected. So this is a scenario where your work doesn't impact other developers. Assuming another developer is still using service A to do some testing or some work, I'm not going to impact the traffic coming to that service in the cluster, just the subset of it that will come to my laptop, so I can do my debugging and my testing and have a fast developer experience without impacting what my colleagues are doing as well. Let me show you a demo of how Telepresence works. The first thing I'm going to do here is run the telepresence connect command. This is going to put my local machine in the cluster and enable me to speak to cluster resources as if I was another resource in the cluster. Let's start this out by accessing one of the services running in the cluster, for instance the very large Java service. In this case, I don't have to put in the IP address, I can just put in the DNS name, because Telepresence has merged my local IP routing table and DNS resolution with the cluster's, making it possible for me to connect to it using the cluster's DNS name.
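Stepping back to the routing described before the demo, the difference between a global and a personal intercept can be modeled in a few lines. This is a simplified sketch of the decision the talk attributes to the sidecar agent, not Telepresence's actual implementation. The `Intercept` class, the `route` function, and the header name `x-demo-intercept-id` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Intercept:
    """Hypothetical model of an active intercept on a service."""
    service: str
    # None models a global intercept; a (name, value) pair models a personal one.
    header: Optional[Tuple[str, str]] = None


def route(request_headers: Dict[str, str], intercept: Optional[Intercept]) -> str:
    """Return where a request ends up: 'laptop' or 'cluster'."""
    if intercept is None:
        return "cluster"  # no intercept: the service in the cluster handles it
    if intercept.header is None:
        return "laptop"   # global intercept: all traffic is rerouted
    name, value = intercept.header
    # Personal intercept: only requests carrying the matching header are rerouted.
    # Simplification: header names are compared case-sensitively here.
    if request_headers.get(name) == value:
        return "laptop"
    return "cluster"


if __name__ == "__main__":
    personal = Intercept("dataprocessingservice", ("x-demo-intercept-id", "abc123"))
    print(route({"x-demo-intercept-id": "abc123"}, personal))  # laptop
    print(route({}, personal))                                 # cluster
```

The point of the model is that a colleague's requests, which lack the header, keep reaching the service in the cluster while yours come to your laptop.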
I'm now speaking to the very large Java service like I'm inside the cluster, without having to proxy in or do any other complex configuration, thanks to the two-way proxy mechanism that Telepresence has set up between my local machine and the cluster. Let's talk a bit about this demo. This demo is called Edgey Corp, and it has a number of services. The data processing service is the one I own and I'm actively developing. The very large Java service is too large for me to run locally. It's owned by another team and I also do not have access to its code. The very large data store, on the other hand, has all the critical test scenarios, and it's also too large for me to run locally and dates back to the creation of the Edgey Corp application many years ago. So without numerous configuration steps or wasted time, telepresence connect enables me to access and interact with the very large data store and the very large Java service without running them locally. I can instantly access these services in the cluster. So aside from being able to access cluster resources as if they're running locally using telepresence connect, we're also going to run the telepresence intercept command. Like I mentioned earlier, there are two types, the global intercept and the personal intercept. So if you look at the Edgey Corp web app now, you see that the UI color here is set to green, and you also see that the data processing service is also set to green. And there's no other information here apart from the data processing service. So I have a local version of this data processing service on my computer. I've not started running it yet, but I'm going to do that now. I've already navigated to the right place, so if I type python3 app.py, it should start up the server of the application on localhost:3000. All right, awesome.
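For readers who want something to intercept, here is a minimal stand-in for a local service like the one started in the demo. It is a sketch only: the real demo application is not shown in the talk, so the handler, the `make_server` helper, and the JSON response format are assumptions. Only the port (3000), the `/color` path, and the value `blue` come from the demo.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

COLOR = "blue"  # the value the local version returns in the demo


class ColorHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the local data processing service."""

    def do_GET(self):
        if self.path == "/color":
            body = json.dumps(COLOR).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)


def make_server(port: int = 3000) -> HTTPServer:
    """Create the server bound to localhost; port 0 picks a free port."""
    return HTTPServer(("127.0.0.1", port), ColorHandler)


if __name__ == "__main__":
    # Each request is logged to stderr with its status code (e.g. 200).
    make_server().serve_forever()
```

Changing `COLOR` and restarting the process is the local equivalent of the code change made later in the demo.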
So if I try to call localhost:3000/color, you'd see that it returns blue and that the call also went through successfully. That's why you can see the 200 here. So now what I'm going to do is intercept the data processing service in the cluster and reroute all the traffic going to this service to the data processing service running on my local machine. To do this, I'll run the telepresence intercept command. So it's telepresence intercept, then the name of the service, which is the data processing service in this case, and then the port, which, as you can see here, is 3000. So if I do this, it's going to create that intercept for me. You can see the intercept name, data processing service, and that it's intercepting all of the requests. So if I go to the application again and reload, you see that instantly it's now accessing the service on my local machine. And if you come here, you'd also see that that call has gone through successfully. You can also notice that the color has changed from green to blue and that the content of this data processing box is no longer the same, showing you that it's now accessing the local data processing service here. So now you can see that I was able to move all of the traffic intended for the data processing service in the cluster to the data processing service running on my local machine. What I want to do now is create a personal intercept. To do that, I'm going to leave this existing global intercept, and I'll do that by running telepresence leave and the name of the intercept, which in this case is data processing service. That intercept has now been left. So if I go back here and try to reload this page, you see that it's back to sending traffic to the service in the cluster. I'm going to run the telepresence intercept command again. So it's telepresence intercept, I'm going to add the HTTP header, and then I'm going to pass the port, which is 3000, and add the name of the service I want to intercept, which is data processing service.
So this should create that personal intercept for us. All right, so what we do here is copy this key-value pair. If we go back to our browser and try to reload this page, you see that it's still going to show the green color, because now, unlike when we ran the global intercept, it's not rerouting all of the traffic. It's only rerouting the subset of the traffic that has the HTTP header. So I go to my ModHeader extension to paste that in. I already have this set up. In your case you'd just click on this plus sign and add the key and value here. Since I already have it, I'm just going to click on the check box and then reload this browser. And as you can see here, I've instantly been able to reroute only the traffic with a specific header, not every single request coming in. So we can combine these two things and instantly get that feedback loop when we make a code change. For instance, let's say I want to change that color from blue to, say, orange. Let me go to the data processing service and change this from blue to orange. So I'll save this. And you see that as soon as I save that, the server gets restarted because a change has been made. And if I come back and reload this, you'd see that the color has also changed automatically. So instead of going through the build-push-test cycle like you normally would, or having to run all of your services locally, or even having to set up port forwarding, you can use Telepresence to do all these amazing things instantly. So that's the beauty of it. And if you think about it, the faster you can move, the faster you can do all of this process, and that is what makes your developer experience great. Instead of having to wait for minutes, sometimes even hours, depending on your Internet connection or how powerful your laptop is, for your service to be containerized, you can use Telepresence to speed up that feedback loop.
You get a better developer experience, which would in turn help not just you, but the entire company and end users at large. Awesome. Now that you've seen how Telepresence works, here's a quick example of a company that utilized Telepresence, and I'm going to explain their before and their after. Without Telepresence, they didn't have a great developer experience. But according to them, after they started using Telepresence, their developer experience improved drastically. Before Telepresence they had to bear the operational and resource burden of running all their microservices locally, but with Telepresence that was completely removed. They only had to run the service they were updating locally, everything else was running in the remote Kubernetes cluster, and they could instantly see the feedback of their code changes. And then they moved from not being able to utilize the benefits of both local and remote development to having the best of both worlds. So they could use local development tools for debugging, hot reloading, and seeing logs faster. But their laptops didn't have to be stressed or become too hot, because most of their dependent services were still running in the remote Kubernetes cluster. Also, they moved from having to code, build the container, push it to a registry, deploy, and wait before being able to test the impact of their code changes, to just coding, intercepting, and immediately testing and seeing the impact of their code changes. So, wrapping up: Kubernetes development teams need to have a developer experience that allows them to focus on things that matter, which is coding, testing, and iterating, instead of focusing on things that do not matter, like waiting for the build-push-test cycle to be completed or discovering that you have a bug in production because you missed it due to having a testing environment that doesn't reflect the actual production environment.
So once this is done, it would significantly increase the productivity of the developers on the team and the number of updates being shipped to production. Telepresence gives you that all-round developer experience as it bridges the gap between local development and clusters, giving you the best of both worlds so you can utilize all of the interesting things you love about local development and remote development. So if you'd love to try Telepresence, you can do so by visiting this link. We currently have a 30-day free trial, so you get to try it out for your development, see how it improves your development workflow and your developer experience, and of course, invite members of your team to join in as well. Thank you so much for joining this talk and listening till the very end. If you have any questions, feel free to join our community Slack at a8r.io/slack or send me a DM on Twitter at Didicodes. Thank you.

Edidiong Asikpo

Developer Advocate @ Ambassador Labs



