Conf42 Cloud Native 2022 - Online

Local Microservice Development with Remote Kubernetes Assist


Abstract

As we took our SaaS platform from Alpha to Beta to GA, we accrued a rapidly expanding set of microservices. Engineers were unhappy with the performance they were experiencing in local development, and tests on a single microservice became meaningless without the others.

This presentation walks through our experiences in building a scalable system that allowed our team to continue developing on their laptops as our platform grew. These include first implementing Docker Compose, then Kompose to move most of the workload to Kubernetes, and then Kotlin tooling to improve the flexibility of our local/Kubernetes hybrid development environment.

This talk will look at the tradeoffs of these different tools and the iterations that led us to where we are today. We will dive into how conference attendees can think about implementing these tools in their own environments to help engineers develop locally.

Summary

  • Conf42 Cloud Native 2022: thank you for joining my presentation. My talk is entitled Local Microservice Development with Remote Kubernetes Assist. At StackHawk we build tools for test-driven security. I should also mention that we are hiring.
  • When we started about three years ago, we wanted to build out a really state-of-the-art platform. To help us with our build platform, we created a big bash script. All of this runs on AWS CodeBuild, which just runs build steps.
  • We like building, running, and debugging things locally, and using profilers locally. So we found something called Kompose. It can reuse your existing Docker Compose files and generate Kubernetes manifests based on them, and then it would set up Kubernetes port forwarding to reach those microservices locally.
  • The new Biodome command gives access to common build functions that we've abstracted out into libraries in Ari. It works directly against the Kubernetes and AWS APIs. It's opinionated, super simple for newcomers to use, and it's flexible and extensible.
  • We've also thought about other functions that we can use to provision other kinds of resources for developers. We ended up building an IDP that really helps our developers get on board fast. I want to take a moment to thank all of the developers at StackHawk; everybody really took part in this project.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Conf42 Cloud Native 2022. Thank you for joining my presentation. I hope you've been enjoying all the other presentations going on today. My talk is entitled Local Microservice Development with Remote Kubernetes Assist, and it's really a story about how we at StackHawk invested in our dev tooling and how we were able to use Kubernetes and other cloud native services to facilitate our software development process. My name is Zach, and I've been in startups for most of my career. I really love the opportunities and the excitement of startup companies, and I feel like you get a breadth of experience that you can't get, or don't often get, at larger companies. Automation has been kind of a career theme for me, starting out with networks and systems and security, and these days I focus a lot on software delivery, but I still do that other stuff too. At StackHawk we build tools for test-driven security, and our primary tool is a DAST scanner that's built on the open source project OWASP ZAP. DAST stands for dynamic application security testing, and it's a type of scanner that probes a web app for known vulnerabilities by attempting to exploit them. Among DAST scanners, StackHawk is the best for running in CI/CD and for working in a team environment. It's also free for developers. And if you happen to be looking for a new challenge, I should also mention that we are hiring.

Okay, so I want to talk about our app platform, because this is the context of our story. When we started about three years ago, we had a greenfield opportunity in front of us, and we wanted to build out a really state-of-the-art platform. So we looked at the CNCF guidelines for consistency between apps, and we wanted to do things like twelve-factor apps and stateless design patterns, and set up some common rules so that once we got started, we had a great platform to build on. We knew that there was going to be a run-anywhere DAST scanner component to our architecture, and this is what people would run out in the field to scan their applications. We would also have a web UI, which is where you would look at your scan data, and both of these components would tie into microservices and REST APIs running in Kubernetes. So we had the run-anywhere DAST scanner, a React single-page application UI, and a REST API running on microservices, and we would use Kotlin as the language to develop those microservices.

At the same time, we wanted to go ahead and invest some time in our build platform. We built it with security in mind, but we also built it with fast iteration in mind. And we wanted to set it up as a GitOps kind of platform, where hopefully coders would just build and test locally, check in code, and then automated build systems would take over and deploy the software, as long as all the checks completed correctly. To do this, we set up a bunch of AWS accounts: a build account, which would serve as our build environment, and several runtime accounts, so that we could isolate the build operations from the runtime operations. And then across those we would also stripe in different environments: a production environment, of course, but also several staging environments where we could test things out. To help us in this build platform, we created a big bash script. It started out small, but it got bigger over time. We called it Biodome.
And Biodome is this library of bash functions that would help us in our local development as well as in the pipeline when we were building our software. It would do things like get information about our AWS environments: figure out what environment we're running in, and so what environment we would be targeting, and what the account numbers were for the different account types and app environments. It also contained some common functions for pushing and pulling artifacts, such as JARs and container images, and it would take care of deploying manifests and Helm charts, or at least help with that stuff, to make it easy. At the same time, we adopted a build tool called Gradle, which is a pluggable JVM tool similar to Maven. It's an opinionated tool, and it makes it easy for a JVM-based language like Kotlin to build, test, and package artifacts. We also went ahead and started to build our own library of functions and plugins that we could plug into Gradle, and we called that library Ari. Finally, all of this stuff runs on AWS CodeBuild. It's a really simple system that just runs build steps in response to GitHub PR and merge webhooks that come in.

So at a high level, what this looks like is: we've got a repo for every microservice that we develop, and those repos live in GitHub. As developers issue PRs and merges to GitHub, GitHub sends a webhook over to CodeBuild in AWS, and CodeBuild kicks off the build job. The build job uses Biodome, the big shell script library, as well as Gradle, to perform all the building and testing, figure out which environment we're in, and deploy the software, as long as all the checks complete correctly.

So we had all this stuff in place, and it felt like a pretty good platform. We got started, and for a couple of weeks it was really pretty cool. Developers could bring their own favorite IDE, they would test locally on their laptops (code, build, test, repeat), and then submit PRs, and automated build and deployment would take over from there. But it turns out that it ended up being kind of a chore for developers to set up all the microservice dependencies that they needed to work on their target service. If they're working on, say, service A, then as we build out more microservices, maybe that microservice really doesn't do much unless it's got services B, C, and D. So we needed to figure out a way to make it easy to bring up the latest versions of services B, C, and D so that they could just get to coding.

What we came up with was a system built on Docker Compose, and there's this cool feature in Docker Compose, which is that it can overlay configuration files. Docker Compose, if you're not familiar, is a way to lay out a bunch of services, or containers, in a YAML file, and they can all talk to each other. It's a nice way to put together a little assembly of containers, much like microservices, and it's a good way to develop locally. So we used this feature of overlays, and we concocted a scheme where each one of our microservice project repositories would contain its own Docker Compose file, and each repo would also define all of its microservice dependencies. Each microservice container exposes unique reserved ports, so for service A, you always know that it's listening on port 3200, and maybe a couple of other ports in that range. In each project, we would have a simple script called local-start.sh, and it would pull the compose files from all those other projects and run them together, roughly like the sketch below.
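Our local-start.sh was a plain shell script, but to make the merge mechanic concrete, here is a rough Kotlin rendering of the same idea. It is a sketch under assumptions, not our actual script: the file names are hypothetical, and the real script also took care of pulling those compose files down from the other repos first.

```kotlin
// Rough sketch only: the real local-start was a bash script, not Kotlin.
import java.io.File

fun localStart(dependencyComposeFiles: List<File>) {
    // Repeating -f merges the compose files, so the dependency services
    // defined in each repo come up together as one composition.
    val command = mutableListOf("docker", "compose")
    dependencyComposeFiles.forEach { command += listOf("-f", it.path) }
    command += listOf("up", "-d")
    ProcessBuilder(command).inheritIO().start().waitFor()
}

fun main() {
    // Working on service A: bring up B, C, and D from their (already pulled) compose files.
    localStart(
        listOf(
            File("docker-compose-svc-b.yaml"),  // hypothetical file names
            File("docker-compose-svc-c.yaml"),
            File("docker-compose-svc-d.yaml")
        )
    )
}
```

Docker Compose merges files passed with repeated -f flags, with later files taking precedence, which is what made the per-repo overlay scheme work.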
Running them together would merge and run all of those dependency microservices, and your target app, service A, would be left out of the mix. The expectation was that you work on service A locally, but you've got all these containers running, listening on the localhost address on their reserved ports, and service A can talk to them. That made it a lot easier to build and test. And when you came back the next day and other people had made changes to services B, C, and D, you could just restart your devcube version zero Docker Compose setup and you'd get all the latest images. It worked really well.

Let me talk through what this looked like. Say you've got a repository, svc-a for service A, in GitHub. That project repository is going to have a Docker Compose file for service A. It's a YAML file, and it defines its own services. You can see in that top box that the Docker Compose file just describes how to bring up service A itself. When we run this composition, of course, if you're working on service A, we're not going to bring up service A in a container; this file is actually used by other projects that depend on service A. So in this compose file, service A defines the image it needs, which is service A with the latest tag, pulled from an ECR repo in AWS; it listens on port 3200, so that should be exposed on the localhost address; and it depends on services B, C, and D. Then if you go to the other projects, for service B and service C and service D, you'll find similar Docker Compose files, and those might define local dependencies. Some of our projects require a database or a Redis store or something, and those can be defined in their own compose files as well. But we can also say that they depend on other microservices. The same is true for the repos for service C and service D.

Okay, so when you run that local-start script and you're in the service A project, it's going to pull the Docker Compose files for services B, C, and D, and this merged Docker Compose file, in effect, is what you bring up when you're working on service A. What that looks like is all of those Docker containers for the other services running locally on your laptop, listening on the localhost address. And then that box on the bottom right shows what you'd see in your IDE: when you run gradle bootRun to bring up your local application, it comes up and it can connect to the other services that are running on your laptop.

This worked really well. It was a snap now to bring up all your dependency microservices and just start coding. And this worked for another four or five weeks. But after a while, the number and the size of these microservices grew and grew, and it became a little bit hard to manage memory between the IDE, Docker Desktop, and your build-test-run sort of functions. We started to question whether we had gotten laptops that just weren't powerful enough. Well, it turns out we did not need faster laptops. What we really needed to do was figure out a way to offload some of those microservices. We had heard about different ways that you can use Kubernetes to assist in your development process, but one thing that we really wanted to keep about our current process was this use of local tools. We really like our IDEs.
We like building and running and debugging things locally, and using profilers locally. So we looked around in the dev tool space, and we found something called Kompose, that is, compose with a K. As you might guess from the name, it can reuse your existing Docker Compose files and generate Kubernetes manifests based on them. So we created a script called devcube.sh, and this is devcube version one. Devcube expected that you had Kompose installed on your laptop, and it would go through that same process: pull down your dependency Docker Compose files, use Kompose to generate the Kubernetes manifests, then go and apply those manifests to Kubernetes in a namespace based on your username. And then it would set up Kubernetes port forwarding to reach those microservices locally. If you haven't used this before, it's a feature of the kubectl command (I pronounce it "kube-cuttle", so if you hear me say that, that's just what I call it). There's a kubectl port-forward command that allows you to set up a port forward so that you can reach your pods or services as if they were running locally.

So now it should look just like before with the Docker Compose setup, except that all of the microservices your service depends on are running out in Kubernetes, and you can continue to develop your target app locally. What that looks like is this: remember that with the Docker Compose setup, what you end up with is a merged Docker Compose file. We go through that exact same process, we pull down that merged Docker Compose file, and then from it we run Kompose to create a bunch of manifests. The manifests end up being a bunch of Deployments and a bunch of Services: the Deployments set up the pods that host your containers, and the Services make it easy to connect to those pods, so we don't have to guess the names of the pods that get generated by the Deployments. Then if you do a kubectl get pods in your namespace, zconger in my case, you would see all of your dependent services running in Kubernetes, and you can reach them on the localhost address. So it looks much the same as the previous process: you're on your laptop, you run devcube up and it creates your Services and Deployments, you can see that your pods are running, and then you run your service A locally, you can develop it, and it is able to talk to all those services.

It was really pretty cool, and we kind of had this sense that it just felt pretty powerful. It was really nice. It was a big performance boost for everybody. Our laptops cooled down, and we could dedicate much more memory to the IDE and to the build-run-test process. And this was kind of a hit. It worked well, especially for UI developers who needed to basically bring up the entire microservice stack.
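Devcube v1 did its port forwarding by shelling out to kubectl port-forward. Purely to make the idea concrete, and as a preview of where our tooling eventually headed, here is a minimal Kotlin sketch of the same thing done against the Kubernetes API with the fabric8 client; the library choice, pod name, and namespace are illustrative assumptions, not what the shell-script version used.

```kotlin
// Illustrative sketch only: devcube v1 used `kubectl port-forward`, not this code.
// Assumes the fabric8 kubernetes-client library on the classpath.
import io.fabric8.kubernetes.client.KubernetesClientBuilder

fun main() {
    KubernetesClientBuilder().build().use { client ->
        // Forward localhost:3200 to port 3200 on a dependency pod in the per-user namespace.
        val forward = client.pods()
            .inNamespace("zconger")       // namespace based on your username
            .withName("svc-b-0")          // hypothetical pod name
            .portForward(3200, 3200)      // (pod port, local port)
        println("Forwarding localhost:${forward.localPort} -> svc-b:3200")
        readLine()                        // keep the forward open until you press Enter
    }
}
```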
This setup worked for a good two years, and it was an amazing feat of shell scripting and local dev tools. But, got to be honest, there were some issues. It was built on an edifice of shell scripts, and shell scripts can be hard to manage over time when they get big; they're just not built to scale quite that much. We had Biodome, the bash function library, and it had gotten pretty big at this point. It also depended on a bunch of CLI tools, and so did devcube, and they were finicky about the versions of dev tools that you were using: not only which semver version you had, but whether you were on a Mac or a Linux box or a Windows machine, you had to pull down different packages. Every software project had a bunch of shell scripts itself, and those were calling Biodome. Even devcube required a lot of locally installed tools. And it was especially bad for new developers coming in. I mean, not too bad, but they really had to do a lot to get their laptops ready to start developing code. They had to install the AWS CLI and Terraform and Docker Compose and this Kompose and a bunch of other things, and the sprawl was really starting to be a bit much.

Remember I mentioned that we also use Gradle as a build tool for our JVM projects. If you're not familiar with Gradle, it's similar to Maven or npm, or make or Cargo, and it's really popular for JVM applications. It's a neat build tool because it's highly opinionated; it's really easy to get started developing with Gradle, but it's also super extensible, and you can use Kotlin or Groovy to build plugins and tasks that you can run in Gradle. Since you can use Kotlin, that was especially useful for us, because we're a Kotlin shop. It gives us access to rich Java and Kotlin libraries, and of course you can do anything with those libraries; there's a ton of them out there now. One of the key things that was useful for us is that it gives us access to cloud APIs.

So what Gradle ends up doing for us over time, as we started to develop our own plugins and tasks, is that it not only builds and tests and packages code, it can also pull in the plugins that we're developing in our project that we call Ari. And we can start to do things like authenticate to CodeArtifact, which isn't a tough thing to do, and most people just use a shell script to authenticate to CodeArtifact if they use it as their artifact repository, but we built it as a task. We can also push and pull containers, push and pull objects to S3, get a lot of that information about our different AWS environments, and deploy workloads to Kubernetes. And we can do all of this kind of stuff, even opening PRs to GitHub, using the native APIs of those services.

Over time, what we found was that our old Biodome shell script started to become less necessary for our build process overall. We were building a lot of the functionality that Biodome provided into Ari and into our Gradle tasks. So we made a decision to formally get off of that shell script and build the rest of those functions into Ari. Furthermore, we wanted to abstract all of those custom functions that we were using for Gradle tasks into libraries that could be used by other software applications. The first software application that we thought of was a command line utility, sort of like Biodome, so we just called it Biodome. But this time Biodome is built using Kotlin. It's a nice CLI, and it generally speaks directly to the APIs for the services that it manipulates instead of relying on a bunch of local dev tools. So it became easier for new developers to get on board, because they would just download Biodome, and in fact we've got helpers in Biodome to help developers install any dev tools that they do happen to need. The advantages, at least to us, were super clear: now we can write all of these build functions in a strongly typed language that's composable, testable, and much easier to scale.
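To give a flavor of what that looks like, here is a minimal sketch of a cloud-aware Gradle task written in the Kotlin DSL. This is not our actual Ari plugin code; the task name is made up and the SDK version is only illustrative. It shows the general shape of replacing a bash helper ("which AWS account am I targeting?") with a typed task that calls the AWS STS API directly.

```kotlin
// build.gradle.kts: illustrative sketch, not StackHawk's Ari plugin source.
import software.amazon.awssdk.services.sts.StsClient

buildscript {
    repositories { mavenCentral() }
    // Put the AWS SDK on the build script classpath so tasks can call it.
    dependencies { classpath("software.amazon.awssdk:sts:2.20.0") } // version is illustrative
}

// Hypothetical task name: asks STS who we are instead of parsing shell output.
tasks.register("printAwsIdentity") {
    doLast {
        StsClient.create().use { sts ->
            val identity = sts.getCallerIdentity()
            println("AWS account: ${identity.account()}  caller: ${identity.arn()}")
        }
    }
}
```

In our case this kind of logic lives in the shared Ari plugins rather than in each build script, so every service repo can pull it in.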
We can grow this tooling to a very large size; we know JVM languages can handle large applications, and you can build on top of them. We can directly access those cloud API libraries, so we can manipulate AWS and Docker, Kubernetes, GitHub, and anything else that comes along. And it's all written in the developers' own language, the language of our platform, Kotlin, so it's accessible for everybody to use, to manipulate, and to build on.

Okay, let's come back to our story. We were talking about devcube. Devcube was a big shell script, but now it's just a part of Biodome. We created a subcommand in the new Kotlin-based Biodome called devcube, and now it's got access to all the common build functions that we've abstracted out into libraries in Ari. We have less reliance on local tools, and it works directly against the Kubernetes and AWS APIs. It's opinionated, super simple for newcomers to use, and it's flexible and extensible, so anybody can go in and add functions to it if they want.

I'll describe the new devcube from a couple of angles. First, the configuration language. The configuration language is again YAML, just like Docker Compose, and it looks similar to a Docker Compose configuration file, but it's more tuned to our exact types of services. So now we define microservices and other dependencies for our platform apps, but we can abstract away a lot of common details. For example, in our previous Docker Compose files we were defining resource requests and limits, so that as these devcube environments came up in Kubernetes, we were telling Kubernetes how much memory and CPU we expected them to take, and that way Kubernetes could autoscale to handle more devcube environments coming and going. Now we can bake that into the libraries that we're calling: we've got some typical sizes that we expect, and of course we can still customize them in our YAML file, but a lot of that stuff we can just assume will be handled in a rational way by default. There are also a lot of common environment variables that we bake in, plus pod permissions in Kubernetes and AWS; we can build those into our libraries as well and abstract them away. For the most part you don't have to specify any of those details, but if you want to you can, because we built some customizability into it.

Then there's the developer experience. Once you've got those devcube configuration files set up in all of the repos for our microservices, a developer working on, say, service A can just run biodome devcube up. By default, that reads in the config files from all the other dependency repos, builds out manifests behind the scenes, and applies them to Kubernetes, so you end up with a devcube environment that looks just the same as what we had previously. But there are other options that we can bake in as well. We've got an option to do a devcube up but bring it up in a local Docker Compose-like environment, so it just brings up those same containers running in Docker Desktop. Or, if you want, you can bring it up in a native JVM way, so everything runs on your local machine, natively on your metal and not in containers. That ends up being kind of a nice way to go if you just have a couple of microservice dependencies, because it's simple and lightweight and pretty fast.
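As a rough sketch of what a devcube-style subcommand can look like in Kotlin (this is not Biodome's actual source; the Clikt CLI library, the fabric8 client, and the manifest-rendering helper are all assumptions on my part), the core of "read the config, build manifests, apply them" is not much code:

```kotlin
// Illustrative sketch only; not Biodome's actual code. Assumes the Clikt CLI library
// and the fabric8 kubernetes-client; the manifest-rendering helper is a stub.
import com.github.ajalt.clikt.core.CliktCommand
import io.fabric8.kubernetes.api.model.HasMetadata
import io.fabric8.kubernetes.client.KubernetesClientBuilder

// Stand-in for reading the per-repo devcube YAML files and turning them into manifests.
fun renderManifestsFromDevcubeConfig(): List<HasMetadata> = emptyList()

class DevcubeUp : CliktCommand(name = "up") {
    override fun run() {
        val namespace = System.getProperty("user.name")  // namespace based on your username
        KubernetesClientBuilder().build().use { client ->
            renderManifestsFromDevcubeConfig().forEach { manifest ->
                // Apply each Deployment/Service straight through the Kubernetes API
                // (no kubectl, no Kompose, no extra locally installed tools).
                client.resource(manifest).inNamespace(namespace).createOrReplace()
            }
            println("devcube is up in namespace $namespace")
        }
    }
}

fun main(args: Array<String>) = DevcubeUp().main(args)
```

In this shape, the Docker Compose-style and native JVM modes mentioned above could simply become alternative backends behind the same subcommand.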
We also have a function in there to take snapshots. You remember that some of these services have their own databases; devcube will bring up those databases, and you add sample data to them, maybe users and some sample scan data. You hate to lose that environment when you bring your devcube down and bring it up the next time. So we added snapshot functionality: we can take a copy of that data and store the backup in S3, and you can select which backup you want to use, or by default there's just a default name for your default snapshot. It's great, it's really handy.

But in addition to that biodome devcube command, now we can do devcube-like things and other functions in Gradle. So now Gradle can deploy devcubes, and those can be super handy as ephemeral test environments, for instance, or to deploy static environments for user acceptance testing. We can also use this to create a bunch of manifests for deployment with Argo CD or Flux CD, so it can be a way for our applications to generate their own installer manifests for cloud native tools that expect to work with those sorts of manifests.

What we had developed here was really a larger internal developer platform, based on our own Kotlin code, reaching out to well-established APIs for AWS and Kubernetes and all these other cloud native services. And it was great, because it was really tailored to our engineers and our environments. It was built with knowledge about the way we do things and the way our developers like to work. It was easy for newcomers to come in and just start using this tool, but it was also easy to add on to, and all of us can add on to it, including the developers. It just made sense.

We've also thought about directions we can go in the future with Biodome and devcube. Some potential enhancements that we've been talking about, or at least I've been talking about: why not add a web UI to get quick access to some of the common operations that are available in our Ari library? We could create UAT devcubes for product, so maybe product could come along and spin up their own UAT environment for tests that they want to do. Or maybe product support could use devcubes for troubleshooting; they could set up an entire platform and an entire scanner environment so that they could run some tests and experiments. We could also create CRDs and controllers to manage our devcube environments. One idea that came up pretty quickly was that, with some of the new functionality we have for devcube, it's possible that developers will spin up more than one apiece. They might spin up several, and over time that could be wasteful. So we might want to have a process that just runs and watches for devcube environments that have been around for too long, maybe takes a snapshot of them for safety, and brings them down.

We've also thought about other functions that we can use to provision other kinds of resources for developers. For instance, when we come up with a new microservice, there's a whole setup process for creating the new repo and the associated build pipelines, and it's automated enough, we use Terraform to do that, but it's still some work. Why not just have a Gradle command or a Biodome command that creates that repo and creates those build pipelines? Similarly, when we go to push new containers to ECR repos, we have to create those ECR repos. Again, we use Terraform to do that, and it doesn't take that long, but it takes a little bit of work and coding to do, and it's not all that DRY. So we actually have done this: we've created a function so that whenever you use Gradle to push a container to an ECR repo, the first thing it does is check whether that ECR repo exists, and if it doesn't, it just creates it for you. Just a huge time saver.
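Here is a minimal sketch of that check-and-create step, using the AWS SDK for Java v2 from Kotlin. It is the general idea only, not our actual Gradle plugin code, and the function name and repo name are made up:

```kotlin
// Illustrative sketch: ensure an ECR repository exists before a container push.
import software.amazon.awssdk.services.ecr.EcrClient
import software.amazon.awssdk.services.ecr.model.CreateRepositoryRequest
import software.amazon.awssdk.services.ecr.model.DescribeRepositoriesRequest
import software.amazon.awssdk.services.ecr.model.RepositoryNotFoundException

fun ensureEcrRepository(repoName: String) {
    EcrClient.create().use { ecr ->
        try {
            // Does the repository already exist?
            ecr.describeRepositories(
                DescribeRepositoriesRequest.builder().repositoryNames(repoName).build()
            )
        } catch (e: RepositoryNotFoundException) {
            // No: create it so the push that follows just works.
            ecr.createRepository(
                CreateRepositoryRequest.builder().repositoryName(repoName).build()
            )
        }
    }
}

fun main() {
    ensureEcrRepository("svc-a")  // hypothetical repository name
}
```

Calling something like this at the start of the container-push task is what gives you the "just push it and the repo appears" experience.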
But why not also make functions available for developers to create their own EKS clusters, their own Kubernetes clusters that they can use for testing without fear of damaging any of the other environments?

Well, that's our developer tool story, and of course it's not over yet. We ended up building an IDP that really helps our developers get on board fast, focus on their work, and take part in building more functionality into it themselves, since it's written in Kotlin, which is what they know. When we look back at all the effort we put in, I think everybody at StackHawk would agree that it was really worth it. It's been a huge enabler, not only for developers, but for the product team and for our ability to quickly deliver and iterate. I hope that our story is helpful for you on your developer tools journey as well.

I want to take a moment to thank all of the developers at StackHawk; everybody really took part in this project. Just to call out a couple: Casey is our chief architect, and he has really been a champion for investing in our local build tools and doing it right. Sam Boland is our full stack engineer, and he's been a massive contributor to the whole effort. Topher Lamey started us down the path of using Gradle plugins, and it's just paid off handsomely. Brandon Ward is a software engineer who joined recently, and he had a bunch of great ideas for how to build good tools for developers. And finally, Scott Gerlach, who inspired us to build a really solid, scalable, secure platform on cloud native technologies so that our laptops would never be a roadblock to success. Thank you so much to all of you here at Cloud Native, and thanks for watching. Take care.

Zachary Conger

Senior DevOps Engineer @ StackHawk



