Conf42 Cloud Native 2022 - Online

History of Software Engineering & how it applies to Infrastructure

Abstract

The craft of Software Engineering has been around for decades and we have learned and improved a lot along the way. Things like keeping code in source control are taken for granted these days, but we remember the days when the latest code existed on production servers or on an engineer’s machine.

As improvements to the craft of Software Engineering gained momentum, the way infrastructure was managed lagged, remaining a manual process for many teams. Over time best practices for Software Engineering are being applied to infrastructure. The quintessential example is Infrastructure as Code.

In this presentation, we will talk about how the history of Software Engineering has and will continue to shape the improvement of infrastructure practices. Then we will introduce newer concepts like Environment as Code that will help further the craft of managing Infrastructure, beyond IaC.

Summary

  • In this presentation we'll talk about how the history of software engineering has and will continue to shape the improvement of infrastructure practices. Then we'll introduce newer concepts like environment as code that will help further the craft of managing infrastructure beyond infrastructure as code.
  • Infrastructure as code enables teams to deliver infrastructure rapidly and reliably. Version control has allowed application code to scale and engineers to collaborate better. With pipelines, infrastructure changes can be tested, deployment becomes repeatable and auditable.
  • Using a single infrastructure as code project like Terraform to deploy all the infrastructure needed becomes complex and hard to maintain as you scale. To avoid that, we started breaking the IaC into various pieces, which helps with collaboration and reduces feature lead time. We need to start thinking beyond infrastructure as code.
  • EaC is an abstraction over infrastructure as code. It provides a declarative way of defining an entire environment. EaC allows teams to deliver entire environments rapidly and reliably at scale.
  • The adoption of microservices made managing individual services easier, but added complexity to application dependencies. Environment as code applies the same concept to an entire environment by packaging all components within an environment, along with the dependencies between those components. This allows you to push the desired state to source control.
  • GitOps makes the entire deployment process configurable in code. It uses the git pull request workflow. Similarly, you can apply GitOps to environment as code. Infrastructure as code has already piqued the interest of many teams.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hi everyone. In this presentation we'll talk about how the history of software engineering has and will continue to shape the improvement of infrastructure practices. Then we'll introduce newer concepts like environment as code that will help further the craft of managing infrastructure beyond infrastructure as code. My name is Adarsh Shah. I am the founder and CEO at zLifecycle. We as a software industry have always learned from other industries. Like waterfall, who remembers? Waterfall came from construction. Lean came from how Toyota used to do their production, and still does. And we are learning how to handle software incidents from industries like airlines. This has helped us grow and improve as an industry. In this talk, we will look at the history of software engineering overall, and then we will see which of these practices have already helped us manage infrastructure, and how we can evolve this further using other aspects. We'll start with version control; around the 1970s is when we started using that. I can't even imagine a world without version control now, but back in the day, the latest code used to be on an engineer's machine. After that we'll talk about deployment pipelines; around 1991 is when we started using deployment pipelines, which gave us more predictable deployments. Then we'll talk about microservices; around 2005 is when we realized that we need to start breaking our monoliths into smaller services to gain faster feature lead time. Then, around 2013-ish, we started using containers. We'll look into how that helped software engineering, and then how that can be applied to managing infrastructure. From there on, Kubernetes is what became the de facto orchestration engine, so around 2014 we started using that, and then we'll see how we can apply the same to infrastructure. And then the latest one here is GitOps. Around 2017 is when it came in, and from there on we started using it to deploy applications. 
And then we'll see how we can use that to deploy infrastructure. Before we get started looking at those aspects, though, let's define what infrastructure as code is, because the first three topics that I mentioned in the history are all achieved through infrastructure as code. I'm sure you all know what infrastructure as code is. It just helps us automate provisioning of infrastructure resources. It is one of the key DevOps practices that enables teams to deliver infrastructure rapidly and reliably. All right, so before version control became ubiquitous, determining the latest version of an application might be a matter of asking what code is on an engineer's local machine, or who deployed it last. We realized this is unreliable, error prone, and makes collaboration difficult. Version control has allowed application code to scale and engineers to collaborate better. Infrastructure continued to be managed manually long after application code left those days behind, whether from a cloud provider's UI or the command line on a developer's machine. Managing infrastructure manually makes it difficult to make changes or troubleshoot regressions. Infrastructure as code made it possible for infrastructure changes to be stored in version control, and it has saved engineers the toil of tracking the current state of their infrastructure and made troubleshooting easier. As you can see here, you should put everything in source control, even a bash script you only use once in a while, and then make it available to everyone so that anyone in your organization can see how their infrastructure is deployed. Initially, it was not uncommon for applications to be deployed with a bash script, or even by manually copying from a team member's laptop, with tests run ad hoc, if at all. The frequent and reliable deployments of today were made possible through pipelines that have continuous integration and deployment. 
So this helped stabilize things and made deployments of your application more predictable and repeatable. IaC has also benefited from the use of pipelines. With pipelines, infrastructure changes can be tested, deployment becomes repeatable and auditable, and managing dependencies and secrets is made easier. This diagram shows an evolution from running infrastructure as code on an engineer's machine to a shared environment using pipelines. Let's dig deeper into an IaC pipeline. As you can see on the left, we have the continuous integration bit, which means that we can run some static analysis against the IaC and run any unit tests, to see if those things pass before pushing our changes. After that's done, we do testing and validation. For that we can provision temporary environments using IaC and run compliance tests, integration tests, any security checks that you want to have, and any smoke tests. This gives us confidence that our changes will work in production. And then, if that works, we go to the last step. That is, we deploy those changes and provision our infrastructure in production, and then we run some smoke tests to validate that production is what we want. So this is how we have been using pipelines with infrastructure as code to achieve more reliable provisioning of infrastructure. As applications grew more complex, deploying the entire application as a monolith became a bottleneck to continuous delivery. A bug in one part of the code base or infrastructure prevents the entire application from deploying. Domain logic and tech stack decisions are often tightly coupled throughout the code base. Because it's all a monolith code base, you have everything deployed as a single unit. They are all tightly coupled, so you can't use different technologies. So teams began breaking monolith applications into smaller self-contained components, or microservices, that could be managed and released independently. 
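The IaC pipeline stages walked through above (static analysis and unit tests, a temporary test environment, then a production apply followed by smoke tests) can be sketched roughly as a sequence of gated steps. This is a minimal illustrative sketch, not any real CI tool's API; the stage names and runner functions are hypothetical.

```python
# Hypothetical sketch of an IaC pipeline: run stages in order, stop at the
# first failure so a bad change never reaches production.

def run_pipeline(stages):
    """Run each (name, step) pair in order; return completed stages and the
    name of the failed stage (or None if everything passed)."""
    completed = []
    for name, step in stages:
        if not step():
            return completed, name  # stop: this stage failed
        completed.append(name)
    return completed, None

# Each lambda stands in for a real check, e.g. terraform validate, linting,
# provisioning a temporary environment, or a production smoke test.
stages = [
    ("static-analysis", lambda: True),
    ("unit-tests",      lambda: True),
    ("ephemeral-env-tests", lambda: True),  # compliance/integration/smoke tests
    ("prod-apply",      lambda: True),
    ("prod-smoke",      lambda: True),
]
completed, failed = run_pipeline(stages)
```

The point of the gating is that a failure in the ephemeral-environment stage prevents the production apply from ever running.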
This helped them go to market faster, reduced the feature lead time, and made working with applications more efficient. Here's an example of monolith to microservices. If you have a retail application with a website that allows people to list products, check out, make payments or billing, and has shopping cart aspects, you have the monolith application on the left. And then with a microservices architecture you can start breaking those into separate services. You can even have your UI layer or data access layer and all of those as separate services. The same can be said for infrastructure. Using a single infrastructure as code project, like Terraform, to deploy all the infrastructure needed becomes complex and hard to maintain as you scale. To avoid that, we started breaking the IaC into various pieces, which helps with collaboration and reduces feature lead time for infrastructure. Breaking IaC into loosely coupled components makes it easier to understand and maintain. So here's an example where, let's say, you have monolith infrastructure as code, with your networking, platform, EC2 and S3 bucket and database and all of that together in a monolith. With this, if you run your Terraform, you have a single state file. If you make any changes, you have to run all of that again and again, and that becomes slower as you scale. So in the case of micro IaC, you basically break each of these components into separate Terraform runs. You would have separate networking, which you would run on its own. After networking you would run, let's say, your EC2 instances Terraform, or Kubernetes Terraform, or Postgres Terraform. And then after that you would run your Kubernetes add-ons Terraform. So breaking these up separately gives you loosely coupled components, and you can make a change to one, run just that, and have a more efficient way of provisioning infrastructure. So this has been working great, right? 
Well, not exactly, because IaC is a powerful and crucial tool. However, if you have ever managed many divergent environments, something like this one, where you have your networking separate and you have a bunch of these different components with your databases, let's say you have SageMaker, Kubernetes clusters, S3 buckets and all that, you know that with existing tools there is still a long road ahead in improving developer effectiveness with infrastructure as code. And that becomes painful. Here's how you would actually provision an environment like that using existing tools: you'd basically hand-roll pipelines. You'd write a lot of logic in Jenkins or CircleCI or whatever pipelining tool you are using, and that logic will basically run your various IaC components, and then you'll manage those complex dependencies within the pipeline. So while your IaC is declarative and idempotent, these pipelines are not. You have to write a lot of custom code to provision an entire environment. For example, the logic for executing the networking layer first, then the Kubernetes layer, and then the Kubernetes add-ons layer, all of that needs to be written in your pipeline code. Teardown must be supported as well. If you have any failures that impact the environment, you need to account for that too. And a lot of that just comes down to manually solving these problems outside the pipelines. So as you can tell, these options are inefficient, costly, and typically require a dedicated team. Again, we are talking about more complex environment deployments. If you have a very simple setup, you don't need to worry about that; just having some simple IaC and a pipeline would do the job. So infrastructure as code is great and helps in a lot of ways, but it becomes limiting as we scale. We need to start thinking beyond infrastructure as code. 
We need to start thinking about how we can use infrastructure as code for the things it's powerful at, but also what we can add on top of it to help us achieve scalability and something more efficient in production. So that's where the right side of this history of software engineering comes in. There are practices like version control, deployment pipelines and microservices which we are already using with infrastructure. How about using the other practices that are out there, like containers or package managers or Kubernetes, which have worked pretty well with applications, and GitOps, which has worked well in recent years? Now we'll look at those practices and how they can be applied to manage infrastructure, and we will use environment as code as a mechanism to achieve that. So before we dig in deeper, let's actually define what environment as code is. It is an abstraction over infrastructure as code, so you're still using infrastructure as code under the hood. It provides a declarative way of defining an entire environment. It works well with a control plane that manages the state of the entire environment, including any relationships between various resources. It detects drift as well as enables reconciliation. It supports best practices like loose coupling, idempotency, immutability, et cetera for the entire environment. EaC allows teams to deliver entire environments rapidly and reliably at scale. I know it was a long definition, but I think it was important to read through it. So now we have defined what environment as code is, and we'll look into it deeper as we go through these various topics. Let's come back to one of the practices that has worked well for us in software engineering, working with applications. The adoption of microservices made managing individual services easier, but added complexity to application dependencies. 
Managing dependencies between various microservices can put even the most robust continuous delivery pipeline to the test. This makes deploying an entire application environment a pain point. Similarly, infrastructure as code creates complex dependency graphs that can be difficult to navigate. Several tools have gained popularity after microservices because they help simplify complex deployments. Containers is one of those tools; it helps organize dependencies so that an engineer doesn't deploy a server and then deploy their application along with it. On top of that, Helm charts and Kustomize take the concept even further, serving as package managers for a suite of many services. These tools have helped us improve developer effectiveness and deploy to production with more confidence. Similarly, environment as code applies the same concept to an entire environment by packaging all components within an environment, along with the dependencies between those components. Let's look at what example code for environment as code looks like. But before we do that, let's just show, at a high level, the difference between environment as code and infrastructure as code. Infrastructure as code automates various Lego pieces. If you take the Lego analogy, you have various pieces, which in this case are infrastructure resources, or could be a group of them. Infrastructure as code automates those and is good at doing that. What environment as code does is automate how those various Lego pieces are connected to make up a Lego toy, which is where the most value is. Getting your entire environment, your databases, your networking, your Kubernetes clusters or EC2 instances or S3 buckets, all of that together is more meaningful and useful to product teams. Now let's dig a little deeper into the provisioning workflow using environment as code. 
So as we can see at the top, you declare your environment as code, and we will look at an example in the next slide. Once you push that to your source control, there is a control plane associated with environment as code that picks up the change. It sees there is a drift, because you're doing it for the first time, or maybe there is a drift in an existing environment, and it starts reconciling. It also manages state for your entire environment: all the dependencies that are there and what their status is, so that you can make intelligent decisions and provision as and when needed. From there the control plane starts reconciling from the top node, networking being first here, then your Kubernetes cluster, your Postgres database, and your Kubernetes add-ons after that. So the control plane runs these various infrastructure as code components in the right order. As you can see, each of those boxes says Terraform: the networking, the platform, the Kubernetes add-ons, Postgres and all of these are Terraform code that have their own state. But what environment as code does is also maintain state at an environment level, because you want to know the state of the entire environment and how those various pieces are connected. So here's an example. This is our custom format, so you can use other formats as well. As you can see from the top, we are using a custom resource definition, the Kubernetes custom resource definition. At the top we have your environment name, your team name, and some other parameters. And you can see that we have various components, various Lego pieces that make up the entire environment, as part of this file. And then you will see here on line 54 and 55 that each component has a type. In this case we have terraform as a type, because we are using that to provision the platform EKS. And you can also specify dependencies. That's how you tie everything together. 
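The core idea here, components with a type plus a dependsOn list, from which the control plane derives a run order, can be sketched in a few lines. This is a hypothetical illustration, not zLifecycle's actual format: the component names mirror the talk's example (networking, platform EKS, add-ons, Postgres), and the ordering is just a topological sort of the dependency graph.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical environment definition, loosely mirroring the structure the
# talk describes: each component has a type and optional dependencies.
environment = {
    "name": "dev",
    "components": [
        {"name": "networking",   "type": "terraform", "dependsOn": []},
        {"name": "platform-eks", "type": "terraform", "dependsOn": ["networking"]},
        {"name": "k8s-addons",   "type": "terraform", "dependsOn": ["platform-eks"]},
        {"name": "postgres",     "type": "terraform", "dependsOn": ["networking"]},
    ],
}

def provisioning_order(env):
    """Return a valid run order for the components: every component runs
    only after everything it depends on has run."""
    graph = {c["name"]: set(c["dependsOn"]) for c in env["components"]}
    return list(TopologicalSorter(graph).static_order())

order = provisioning_order(environment)
```

With this definition, networking always runs first, and the platform EKS component always runs before the Kubernetes add-ons that depend on it.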
So in this case, EKS depends on networking. This allows you to define an entire environment and push the desired state to source control. Then environment as code picks it up from there and provisions your environment by going through these components one by one in the right order. So with the increase in microservices adoption, it became clear that containers make it easy to manage those microservices. But then we had the container orchestration war that Kubernetes won, becoming the de facto orchestration tool. Kubernetes made it easier to manage these microservices applications, especially due to its controller pattern that watches the state of your cluster, then makes changes where needed to bring it to the desired state specified in the code. It does a lot of those things for you and makes things easy to manage. Environment as code applies the same logic to infrastructure, using the controller pattern from Kubernetes. Environment as code has a controller that tries to move the current state of the environment to the desired state in the code. You can see in the diagram that we have the desired state on the left, and a control loop that observes, detects drift, and tries to move the current state to the desired state. And then there is the reconciliation aspect there as well. With environment as code, though, we would recommend having an approval step that shows you the plan first, so you can make a decision based on what's in it. Because you might have some destroys in there, right? You might be destroying something, and especially if you have a database, you don't want to just destroy anything without actually checking the plan and approving it. So this is how applying Kubernetes concepts and using Kubernetes to provision infrastructure helps. We looked at this quickly already: the control plane part is what we were talking about just now. 
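The controller pattern described above, diff the current state against the desired state, then reconcile, with an approval gate in front of destructive changes, can be sketched as follows. This is a minimal illustrative sketch, not a real Kubernetes controller; states are modeled as plain dicts of component versions, and all names are hypothetical.

```python
# Minimal sketch of the reconcile loop: compute a plan (create/update/destroy),
# ask for approval if anything would be destroyed, then apply the plan.

def plan(desired, current):
    """What the controller would create, update, and destroy."""
    create = {k: v for k, v in desired.items() if k not in current}
    update = {k: v for k, v in desired.items() if k in current and current[k] != v}
    destroy = [k for k in current if k not in desired]
    return create, update, destroy

def reconcile(desired, current, approve=lambda p: True):
    """Move current state toward desired state; destructive plans need approval."""
    p = plan(desired, current)
    create, update, destroy = p
    if destroy and not approve(p):
        return current  # plan rejected: leave the environment untouched
    new_state = {k: v for k, v in current.items() if k not in destroy}
    new_state.update(create)
    new_state.update(update)
    return new_state

desired = {"networking": "v2", "eks": "v1"}
current = {"networking": "v1", "postgres": "v1"}  # drifted, with an extra component
state = reconcile(desired, current)
```

Here the plan includes a destroy (postgres is in the current state but not the desired one), which is exactly the kind of change the talk suggests gating behind a human approval.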
This control plane running in Kubernetes helps you do that drift detection and reconciliation, and allows you to bring your environments back to the desired state if they are out of sync. So far we have talked about improvements made to close the gap between an engineer adding work to a code base, and that code base working and scaling in production. Turning to code submission itself as a tool for deployment is a natural next step. Kubernetes has already replaced a large portion of deployment pipeline logic with declarative configuration. GitOps builds on this concept and makes the entire deployment process configurable in code. It uses the git pull request workflow, so you can open a pull request, get it approved by someone, and then once you merge your changes to trunk, it picks them up and deploys them. Here's a typical workflow from Weaveworks that shows deploying your changes using GitOps. As you can see on the left, you have your dev environment. Let's say you create a pull request; it gets approved and merged to git, and then the pipeline picks up the change. You run some CI, eventually push your changes to a container registry, and from there it gets deployed to your Kubernetes cluster. Similarly, you can apply GitOps to environment as code. If we start on the left, we push our environment as code to a branch and run some validation, maybe some static analysis or tests, and then we create a pull request. One of your team members can approve the change, or reject it and ask for changes. And then once they merge the PR, the control plane that observes the repo picks up the change, starts reconciling, and gives you your environment, your various infrastructure components. So that's the talk. Thanks for listening. The software industry has made great strides in improving software engineering. 
I'm hoping that this talk helped demonstrate how we can continue to build on existing software engineering practices to achieve similar gains in productivity and performance with respect to managing infrastructure. Infrastructure as code has already piqued the interest of many teams. Environment as code is on the horizon. Thanks everyone for listening. Hopefully this was helpful. You can find related content about the topic at the bitly link below. Please feel free to reach out if you have any questions. Thank you. Have a nice day.

Adarsh Shah

CEO @ zLifecycle



