Conf42 DevSecOps 2021 - Online

From Infrastructure as Code to Environment as Code: Challenges scaling IaC and how to resolve them

Abstract

Infrastructure as Code (IaC) has made managing infrastructure easier in a lot of ways, but there are many challenges that companies accept as the cost of adopting IaC, especially when scaling. IaC is good at provisioning individual resources (or a few of them together), but engineering teams want an entire environment with various components like networking, platform (EC2/EKS), database, S3 buckets, etc. to deploy and operate their applications.

To provision and tear down an entire environment, these teams have two options. They can either hand-roll pipelines to manage individual resources and then manage complex dependencies between these resources within those pipelines, or create a monolith IaC for the entire environment. These approaches are inefficient and slow down feature development and innovation. They also make replicating, visualizing, and understanding environments difficult. What if there were a better way?

This talk digs into these challenges to better understand them and then looks at how to resolve them. We will introduce Environment as Code (an abstraction over IaC), which enables teams to provision and tear down entire environments efficiently and promotes best practices like loosely coupled infrastructure resources.

Key Takeaways:

  • Challenges scaling Infrastructure as Code
  • What is Environment as Code?
  • How Environment as Code can help resolve those challenges

Summary

  • From infrastructure as code to environment as code: challenges scaling IaC and how to resolve them. I will also introduce environment as code, which has helped us resolve those challenges. Some of what you will hear today around environment as code is new, and I would love to hear what you think about it.
  • Infrastructure as code helps us automate provisioning of infrastructure resources. It is one of the key DevOps practices that enables teams to deliver infrastructure rapidly and reliably. This talk is focused on teams who have already broken down their infrastructure as code into smaller runs.
  • Environment as code is an abstraction over infrastructure as code. It provides a declarative way of defining an entire environment and allows teams to deliver entire environments rapidly and reliably at scale.
  • The control plane manages the state of the entire environment, including any dependencies between various components. The components are infrastructure as code pieces that have their own state; the IaC tool (Terraform in the talk's example) is responsible for provisioning resources in your cloud provider. For teardown, the control plane reverses the logic and starts from the leaf node.
  • Environment as code manages an entire environment, so it should support defining that entire environment with various infrastructure components. Idempotency and immutability are key principles for infrastructure as code. Idempotency simplifies the provisioning of infrastructure and reduces the chances of inconsistent results.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello there. Today's topic is "From infrastructure as code to environment as code: challenges scaling IaC and how to resolve them." This talk is based on my and my team's experience working with and helping various companies adopt infrastructure as code, and the challenges we have seen scaling infrastructure as code over the years. I will also introduce environment as code, which has helped us resolve those challenges. Some of what you will hear today around environment as code is new, and I would love to hear what you think about it, answer any questions, and discuss it further. My name is Adarsh Shah. I am the founder and CEO at CompuZest.
We all know what infrastructure as code is. It helps us automate provisioning of infrastructure resources. It is one of the key DevOps practices that enables teams to deliver infrastructure rapidly and reliably.
Here's a typical evolution of your IaC. You probably start with a very simple setup: a monolith IaC with a single IaC run. If you are using Terraform, that means a single state file. As you can see in the diagram on the left, it has networking, platform EC2, and an S3 bucket, all in a single monolith IaC run. As you scale, you start breaking the IaC into separate, smaller IaCs. As you can see in the diagram on the right, there is a separate networking IaC, then platform EC2, platform K8s, and Postgres that depend on networking, and then K8s add-ons that depend on platform K8s. This talk is focused on teams who have already broken down their infrastructure as code into smaller runs or are looking to do that. If you have a very simple setup and can just execute it as a single run, you won't have most of the challenges we talk about today.
For the execution of your IaC, it probably starts with running it on the engineer's machine. But as you mature, you have more members of the team who want to run the IaC and want a more reliable and stable execution environment.
You create a pipeline or use GitOps to execute the infrastructure as code from a shared environment. This talk is focused on teams who already use pipelines or GitOps or are looking to do so.
Application teams need an entire environment, something like you see in the diagram, to deploy and operate their applications. Just getting networking, platform K8s, or an RDS database on its own is not going to allow them to run their applications. They need an entire environment, whatever that might mean for the team, that has these various infrastructure resources and the dependencies between them. In this example environment, networking needs to be provisioned first, then platform EC2 and platform K8s, and then the K8s add-ons.
If you want to get an environment like this using infrastructure as code, here are your options. Option one: create a monolith IaC. But we all know this is bad. It creates tight coupling and is not recommended unless, of course, you have a very simple use case. With monolith IaC, the state files become large and it becomes painful to maintain them. Option two involves hand-rolling pipelines using tools like Jenkins, CircleCI, et cetera to run the IaC and manage complex dependencies in the pipeline code. While your IaC is declarative and idempotent, these pipelines are not. You have to write a lot of custom code to provision an entire environment. Teardown must be supported, and any failures or errors that impact the environment must be accounted for as well. If there are failures or errors during execution, they usually get handled manually by engineers. As you can tell, these options are inefficient and costly, and they typically require a dedicated team.
Here are some other challenges scaling IaC. If you want to follow principles like immutability for your environments, or make it easier to share best-practice implementations of environments across various teams, having a mechanism to easily replicate environments is critical.
Since the pipelines I mentioned in the previous slide are not ideal for managing entire environments, it becomes painful to replicate them. Teams spend a lot of time writing custom code to replicate environments.
Visualizing and understanding environments is challenging too, and teams struggle to do that. Trying to find that information by going directly to the cloud provider's dashboard is even more confusing. If they want to troubleshoot an issue, share knowledge between teams, or make any changes to existing environments, they need to go through a painful and time-consuming process. A lot of teams create diagrams of their environments, with the various infrastructure resources and how they are connected, using tools like Visio or draw.io. But these diagrams soon get out of date with the real environments. Instead of helping, they actually provide incorrect information and can cause confusion over a period of time.
Due to human error or indirect changes, provisioned infrastructure drifts from the desired state in code. With existing solutions, like using a pipeline to execute IaC, drift can be detected since IaC is declarative, but only when that pipeline executes the IaC the next time, and it will only find drift within that particular IaC. Teams should know about the drift right away, and not just for individual infrastructure resources or a few of them, but for the entire environment and its various component dependencies, so they can remediate any issues as soon as possible.
Now that we understand the challenges scaling infrastructure as code, let's understand what environment as code is and how it helps resolve those challenges. We can start by looking at it at a higher level. Environment as code is an abstraction over infrastructure as code. As you can see in the diagram, environment as code is declarative, and it executes and manages various infrastructure as code components.
Various IaC components are responsible for provisioning infrastructure resources. EaC is responsible for executing infrastructure as code in the right order. If we use the Lego analogy, infrastructure as code automates the various Lego pieces, which are your individual infrastructure resources or a few of them together, while environment as code automates how these Lego pieces are connected to make up a Lego toy: your entire environment.
Here's a definition. I know it's long, but I think it's important to go through. Environment as code is an abstraction over infrastructure as code that provides a declarative way of defining an entire environment. It has a control plane that manages the state of the environment, including relationships between various resources, detects drift, and enables reconciliation. It also supports best practices like loose coupling, idempotency, immutability, etc. for the entire environment. EaC allows teams to deliver entire environments rapidly and reliably at scale.
Now let's dig deeper into provisioning with environment as code. At the top, we define our environment as code, which is declarative. We push it to source control. The control plane that's associated with environment as code picks up those changes. It manages the state of the entire environment, including any dependencies between various components, their statuses, et cetera. And then it starts reconciling the various infrastructure components. These are infrastructure as code pieces that have their own state. So if you're using Terraform, like you have in this instance, networking is actually Terraform code, and Terraform manages the state of that networking component. Once networking is done, it provisions platform K8s and Postgres. So the control plane that's associated with environment as code manages the order in which these components run, and after that, the K8s add-ons run.
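The provisioning order described above (networking first, then platform K8s and Postgres, then the K8s add-ons) amounts to walking a dependency graph. The following is a minimal sketch of that idea, not the control plane's actual implementation. The component names and the `provisioning_waves` helper are hypothetical, and it assumes Python 3.9 or later for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical component names mirroring the example environment in the talk.
# Each key maps to the set of components it depends on.
DEPENDS_ON = {
    "networking": set(),
    "platform-k8s": {"networking"},
    "postgres": {"networking"},
    "k8s-addons": {"platform-k8s"},
}


def provisioning_waves(depends_on):
    """Group components into waves; a wave only starts once the
    components it depends on have finished."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)  # a real control plane would run the IaC here
    return waves
```

Calling `provisioning_waves(DEPENDS_ON)` groups the components into three waves: networking alone, then platform K8s and Postgres together, then the K8s add-ons. Components within a wave have no dependencies on each other, so they could run in parallel.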
So as you can see, the control plane is what manages all of these various pieces. But infrastructure as code, Terraform in this case, is actually responsible for provisioning resources in your cloud provider. And then for the teardown, it reverses the logic, starts from the leaf node, and goes up the chain.
Now that we have looked at what environment as code is, let's look at the various attributes of environment as code. Environment as code manages an entire environment, so it should support defining that entire environment with its various infrastructure components in an easy-to-understand format. It also supports specifying various relationships between these components. This diagram shows an example environment as code using a custom YAML format. We use this for our product zlifecycle, but it doesn't have to be YAML; any format that lets you specify the entire environment can be used. As you can see on line 54, it allows you to specify the type of infrastructure as code, which is Terraform in this case, and also that this component depends on the networking component.
Environment as code promotes loosely coupled IaC components, like you see in the diagram. It brings these loosely coupled IaC components together to give you an entire environment.
Just as infrastructure as code tools have state files to capture the state of each IaC run, environment as code also has a state file that captures the state of the entire environment, including the various components and their relationships. As you can see in the diagram, it has an operation and a status that tell you what type of operation the last run was and what the current state is. It also tracks each component's operation and status from the last run.
Idempotency and immutability are key principles for infrastructure as code. How do you apply these to an entire environment? Let's first understand what they mean, in case you're not aware of these principles.
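The environment definition and state file described above are only shown on slides, so here is a rough sketch of what such a model might hold. It is not the zlifecycle format: every class, field name, and status value below is an assumption made for illustration. It records each component's IaC type and dependencies, tracks operation and status for the environment and for each component, and derives the leaf-first teardown order. It assumes Python 3.9 or later.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Component:
    name: str
    type: str                            # IaC tool for this component, e.g. "terraform"
    depends_on: list = field(default_factory=list)
    operation: str = "none"              # last operation run for this component
    status: str = "not_provisioned"      # outcome of that operation


@dataclass
class Environment:
    name: str
    components: list
    operation: str = "none"              # last operation for the whole environment
    status: str = "not_provisioned"

    def teardown_order(self):
        """Leaf components first: the dependency order reversed."""
        graph = {c.name: set(c.depends_on) for c in self.components}
        return list(reversed(list(TopologicalSorter(graph).static_order())))


# Hypothetical environment; names and values are illustrative only.
demo = Environment(
    name="demo",
    components=[
        Component("networking", "terraform"),
        Component("platform-k8s", "terraform", depends_on=["networking"]),
        Component("postgres", "terraform", depends_on=["networking"]),
        Component("k8s-addons", "terraform", depends_on=["platform-k8s"]),
    ],
)
```

With this model, `demo.teardown_order()` always places the K8s add-ons before platform K8s and ends with networking, matching the "start from the leaf node and go up the chain" description.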
Idempotency means that no matter how many times you run your IaC or your code, and whatever your starting state is, you will end up with the same end state. This simplifies the provisioning of infrastructure and reduces the chances of inconsistent results. So let's start at the top. Let's say you want three VMs, and your code provisions three VMs. In the non-idempotent case, if you reapply the changes, you get three more VMs. So if you are expecting three VMs, you actually end up with six instead. On the idempotent side, though, if you reapply the change, it knows that you already have the three VMs, so it won't provision any new ones. You end up with the three expected VMs. With EaC, you can achieve idempotency for the entire environment, as it tracks state for the entire environment and knows what the last operation was and its state. Pipelines don't do that for you.
Configuration drift is a huge problem with infrastructure. It occurs when, over a period of time, changes are made to infrastructure that are not recorded, and your various environments drift from each other in ways that are not easily reproducible. This usually happens if you have mutable infrastructure that lives for a long time. These issues can be resolved by using immutable infrastructure. As you can see on the left, if you have version one of your infrastructure code deployed and you make some changes to your code, then in the case of mutable infrastructure you apply the new version to the same infrastructure. So you have long-lived infrastructure. In the case of immutable infrastructure, when you make a change, you actually provision a brand new set of infrastructure with the new version, redirect traffic to that new version, and then get rid of the old infrastructure. Immutable infrastructure means that instead of changing existing infrastructure, you replace it with new infrastructure.
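The three-VM example from the idempotency discussion above can be made concrete with a small sketch. Both functions are hypothetical stand-ins for "applying" infrastructure code, and VMs are represented as plain strings; the point is only the difference in behaviour when the same change is applied twice.

```python
def apply_non_idempotent(existing_vms, count):
    """Always creates `count` new VMs, regardless of what already exists."""
    start = len(existing_vms)
    return existing_vms + [f"vm-{start + i + 1}" for i in range(count)]


def apply_idempotent(existing_vms, desired_count):
    """Creates only the VMs still missing to reach `desired_count`."""
    start = len(existing_vms)
    missing = max(desired_count - start, 0)
    return existing_vms + [f"vm-{start + i + 1}" for i in range(missing)]
```

Applying `apply_non_idempotent` twice with a count of three leaves six VMs, while applying `apply_idempotent` twice with a desired count of three leaves exactly three, because the second run sees that the desired state has already been reached.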
By provisioning new infrastructure every time, you're making sure it is reproducible and doesn't allow for configuration drift over time. Why not apply this principle to the entire environment? You can do that using environment as code. To achieve immutability, you can replace entire environments by bringing brand new environments up instead of changing existing ones.
As mentioned earlier, teams usually create diagrams manually and then keep them updated as they change code. You all know how that goes. The diagrams get out of date over a period of time, provide misinformation, and are more harmful than helpful. Since environment as code is declared in an easy-to-understand format, you can use it to create a visual that helps teams understand their own environments, as well as other teams' environments within their organization. This screenshot is from our product zlifecycle, which is built on the environment as code concept.
Environment as code has a control plane that contains a reconciler, which observes whether the desired state and the current state have drifted and then reconciles them. You might be thinking this looks like Kubernetes controllers, and yes, it is based on the same concept. In fact, you can use Kubernetes controllers to achieve it. In this case, though, it probably makes sense to have an approval step that shows the plan before bringing the actual state back to the desired state, as this might involve destroying or recreating infrastructure.
Comparing and promoting changes across various environments becomes a lot easier with environment as code, since it is declared in an easy-to-understand format and pushed to source control. You can compare the code for various environments and promote changes.
You can also use GitOps for the entire environment using environment as code. So let's look at what the GitOps flow would look like with EaC.
We start on the left. You define your environment as code, push it to a branch, validate that everything is valid, and create a pull request. Someone from your team looks at the PR and approves it, and then it eventually gets merged to main. There is a control plane associated with environment as code that observes the repository, picks up the change, and then starts the reconciliation process.
Thanks, everyone, for attending my talk. Please feel free to reach out if you have any questions about environment as code or infrastructure as code. We also have a product that uses the same environment as code concept, so check it out. It's called zlifecycle. And as I mentioned at the start, I would love to get your thoughts on environment as code. Thanks again.
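The reconciliation process that the control plane starts after picking up a change, together with the approval step suggested in the talk, might look roughly like the sketch below. It is a simplified illustration under stated assumptions: desired and actual state are plain dictionaries keyed by component name, `plan` and `reconcile` are hypothetical helpers, and "applying" the plan is simulated by returning the desired state instead of running any IaC tool.

```python
def plan(desired, actual):
    """List the changes needed to bring `actual` back to `desired`."""
    changes = []
    for name, spec in desired.items():
        if name not in actual:
            changes.append(("create", name))
        elif actual[name] != spec:
            changes.append(("update", name))
    for name in actual:
        if name not in desired:
            changes.append(("delete", name))
    return changes


def reconcile(desired, actual, approve):
    """Show the plan to `approve`; only apply it when approval is given."""
    changes = plan(desired, actual)
    if changes and approve(changes):
        # Stand-in for executing the IaC components in dependency order.
        return dict(desired), changes
    return actual, changes
```

An empty plan means no drift. Passing a callback such as `lambda changes: False` leaves the environment untouched while still reporting the detected drift, which is useful when the plan would destroy or recreate infrastructure.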
...

Adarsh Shah

CEO @ CompuZest
