Conf42 DevSecOps 2023 - Online

Infrastructure as Code (IaC) Security Best Practices and Strategies

Abstract

In this talk, we will dive deep into the best practices and strategies for securing Infrastructure as Code (IaC). We’ll focus on the relevant techniques for risk mitigation so that we’ll be able to protect the infrastructure along with the applications and services running inside these resources.

Summary

  • Joshua Arvin Lat is the chief technology officer of NuWorks Interactive Labs. He is also an AWS Machine Learning Hero. Next year there will be a new book, and it's called Learning Serverless Security. If you're into cloud security, then this book is for you.
  • The more we use the cloud and the more complex the system gets, the more components it would have. One of the techniques available would be the usage of infrastructure as code tools to convert a complex infrastructure into templates. It's important to avoid insecure defaults and of course regularly check for announcements in cloud platforms.
  • The challenge is: what if you decided to launch a server in a public subnet, and inside that server you're going to run the IaC templates? If that server gets compromised, what could possibly happen next? The next best practice would be to track and manage changes using version control tools.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hi everyone, thank you for joining my session. Today I'll be talking about infrastructure as code security best practices and strategies. Before we start, let me introduce myself. I am Joshua Arvin Lat, and I am the chief technology officer of NuWorks Interactive Labs. I'm also an AWS Machine Learning Hero. I am also the author of the books Machine Learning with Amazon SageMaker Cookbook, Machine Learning Engineering on AWS, and Building and Automating Penetration Testing Labs in the Cloud. Next year there will be a new book, and it's called Learning Serverless Security. So if you're into cloud security, then this book is for you. So let's begin.
In the past, we usually thought of web applications as more or less single components. After a while we realized that there are several parts to them, like the front end, the back end, and also the database. So the front end connects to the back end, and the back end connects to the database. Basically, this is still the same web application which serves the end users. The more we use the cloud and the more complex the system gets, the more components it would have. So for example, if you have an architecture like this, where you have several components like a CDN, a load balancer, as well as the servers, then we're able to make the system more resilient, especially if the servers can auto scale depending on how many users are using the application. So if there are two to three times more users, then maybe we can add more servers inside the setup. This is one of the advantages of having a distributed setup, but of course it would involve more resources and more components at the same time. If you want to incorporate security resources like firewalls, then we can easily add those and bind them to existing components. Thus, when it comes to discussions, it may make sense to just have architectures where you have different building blocks.
Each building block has a purpose, and multiple building blocks would be grouped together to perform specific actions, like protecting against certain types of attacks or maybe allowing the system to scale when there's a lot of traffic. That said, when working with a lot of resources, it becomes more important for us to manage these resources in a way that allows us to configure them more efficiently. One of the techniques available would be the usage of infrastructure as code tools and solutions to convert a complex infrastructure into templates. When I say templates, these are basically just text files containing the configuration of the resources which would be created. Once you have these infrastructure as code templates, we are now able to build multiple environments from the same template. So for example, if we have a staging environment and a production environment, both of those environments can come from the same template, and of course they would be configured a bit differently with the right configuration parameters. So more or less you have a template, a text file, and then you would have various configuration parameters depending on where you're going to deploy the resources. If it's a staging environment, then there would be a staging configuration, and if it's the production environment, then you would of course have the corresponding production configuration, with larger resources deployed in a production setting.
So one of the best practices when building environments, especially in the cloud, using IaC tools, for example Terraform, would be to create separate IAM roles, or basically security configurations, and bind those to cloud resources. At this point you're probably asking why. That's because each of the resources, number one, needs to be tagged properly, and each of the resources should be properly configured as well.
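
The "one template, several environments" idea described above can be sketched in a few lines of Python. This is only an illustration and not something shown in the talk: the template syntax, the parameter names (`env`, `instance_size`, `instance_count`, `owner`), and all values are hypothetical, and real IaC tools such as Terraform use their own formats and variable mechanisms.

```python
from string import Template

# Hypothetical IaC template: a plain text file with placeholders.
# The syntax only loosely resembles a real tool's format.
IAC_TEMPLATE = Template("""\
resource "server" "web" {
  name           = "web-$env"
  instance_size  = "$instance_size"
  instance_count = $instance_count
  tags           = { environment = "$env", owner = "$owner" }
}
""")

# Per-environment configuration parameters (illustrative values only).
ENVIRONMENTS = {
    "staging": {
        "env": "staging",
        "instance_size": "small",
        "instance_count": 1,
        "owner": "platform-team",
    },
    "production": {
        "env": "production",
        "instance_size": "large",
        "instance_count": 4,
        "owner": "platform-team",
    },
}


def render(environment: str) -> str:
    """Render the shared template with one environment's parameters."""
    params = ENVIRONMENTS[environment]
    # substitute() raises KeyError if a placeholder has no value,
    # so an incomplete configuration fails early instead of deploying.
    return IAC_TEMPLATE.substitute(params)


if __name__ == "__main__":
    for name in ENVIRONMENTS:
        print(f"--- {name} ---")
        print(render(name))
```

Both environments come from the same text, so a fix to the template reaches staging and production alike, and the tags make each rendered resource attributable to its environment.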
After tagging those resources one at a time, you're able to properly manage the assets: you're able to count and identify which resources have been created, which ones are missing, which ones need to be modified, and which resources need a specific set of permissions. And from there you're able to identify which could be a weak link when it comes to security.
So when building infrastructure resources using IaC tools, it's important to avoid insecure defaults and of course regularly check for announcements in cloud platforms. The tricky part with using infrastructure as code is that the templates and examples available online may already be outdated. There are some cases where the current configuration specified in those default templates may end up being insecure, meaning they may have security vulnerabilities. In some cases, when you use generative AI tools to generate these types of templates, you might end up producing something which is already insecure, something which has vulnerabilities. A good example of this would be an S3 bucket created using IaC. If you accidentally opened that bucket for access to anyone in the world, then anything you store inside that storage container could easily be accessed and downloaded by everyone else. So if that storage contains, let's say, a database dump or a set of files containing very sensitive information, then it's going to affect your organization as well. So be very careful about this and regularly check for announcements in cloud platforms, especially if they decided to change the defaults into something more secure. This is important, especially if you use cloud platforms like AWS, Azure, and GCP, because even if you already have the IaC templates, those templates may not automatically reflect what has been announced recently.
Now let's talk about secret management and permission management.
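
Before moving on, the publicly open bucket described above is the kind of issue a simple check can catch before anything is deployed. The sketch below is an assumption-laden illustration, not a tool from the talk: the dictionary schema (`type`, `acl`, `public_access_block`) is hypothetical, while the ACL names and the four flag names mirror the S3 canned ACLs and Block Public Access settings.

```python
from typing import Dict, List

# S3 canned ACLs that grant access to everyone.
PUBLIC_ACLS = {"public-read", "public-read-write"}

# The four S3 Block Public Access settings.
PUBLIC_ACCESS_BLOCK_FLAGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def find_public_buckets(resources: Dict[str, dict]) -> List[str]:
    """Return findings for buckets in a parsed template.

    `resources` uses a hypothetical schema: resource name -> settings dict.
    A missing flag is treated as disabled, so the check does not rely on
    whatever the platform default happens to be.
    """
    findings = []
    for name, settings in resources.items():
        if settings.get("type") != "bucket":
            continue
        acl = settings.get("acl", "private")
        if acl in PUBLIC_ACLS:
            findings.append(f"{name}: ACL '{acl}' allows public access")
        block = settings.get("public_access_block", {})
        missing = [f for f in PUBLIC_ACCESS_BLOCK_FLAGS if not block.get(f, False)]
        if missing:
            findings.append(
                f"{name}: public access block not fully enabled ({', '.join(missing)})"
            )
    return findings


if __name__ == "__main__":
    template = {
        "db-dumps": {"type": "bucket", "acl": "public-read"},
        "web-server": {"type": "server"},
    }
    for finding in find_public_buckets(template):
        print(finding)
```

Requiring the settings to be stated explicitly is a deliberate choice here: a template copied from an old example then fails the check even if the platform has since changed its defaults.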
When running IaC tools, the tools of course require credentials, something like a secret key or an access key, to allow them to create resources inside the cloud platform. So the challenge there is: what if you decided to launch a server in a public subnet, and inside that server you're going to run the IaC templates? When I say run, you basically have the IaC templates ready there and you use the command line to convert those templates into actual resources. In most cases, developers and engineers would take the shortcut where the server would have an IAM role with super admin permissions. Of course that would allow you to run anything and build anything from that server. Thus it's super convenient for the engineers to have this type of setup. Unfortunately, the problem there is that the server is tagged as high risk because, for one thing, it's in the public subnet. If it gets compromised, then anyone who has access to that server would technically be able to perform anything in that cloud environment.
So right now you might be asking me why or how, because even if I'm just talking about the concepts, most of us have no idea how these attacks are actually performed. Getting back to the example earlier, let's say that you have a server where your IaC code is converted into infrastructure, and that server has super admin permissions. If that server gets compromised, what could possibly happen next? There are a lot of things that can happen. In some cases, if a team decides to use, let's say, containers to do things, or maybe have different IAM resources configured, then any of these things could happen. Maybe container escape is possible, especially if you decided to utilize containers to run IaC code inside them. A lot of people think that using containers would be a silver bullet. Unfortunately, if you accidentally run containers with excessive permissions, it's also possible to perform container escape.
That is, someone inside the container can access the server where the container is running. The next step is that once an attacker is inside a server, IAM privilege escalation is possible, meaning that someone with very little permissions could technically find a way to access the entire account using the right set of steps. When I say the right set of steps, maybe other IAM resources could be created, and those could then be used to get extra access, which would allow an attacker to perform malicious actions or operations. That would include attacking other organizations, deleting all the resources in your account, and also creating super big resources which would end up closing the account. In other cases there could be databases or data stores which contain sensitive information, and the extra access acquired during privilege escalation can be used to access those databases and data stores.
The next best practice would be to track and manage changes using version control tools. The advantage of having IaC solutions as part of the process is that you have your infrastructure as code, and when you have the resources as code, you're able to keep them as files and use something like Git to manage the changes. So if you have version one, version two, and version three, then you can easily check and iterate using a process very similar to what is followed when developing web applications, for example. So if you have a first version and then you have a new version, instead of deploying that new version in a production environment, you can technically test it out first in a test or staging environment, and then when your application is unaffected, you can get it deployed in a production environment. So again, resources are now converted into code, so everything you can do with code, you can now implement in your IaC process. So here we can see an analogy where you have a picture of evolution.
You start with previous versions and then you'll end up having more modern versions, which would probably take a lot of iterations. And when you're able to start this process, then you can easily find multiple variations. You can have another version which makes use of previous code bases. And again, you can reuse templates, you can layer your templates, and you can make them as fine grained as possible using the right set of techniques. So again, I'm just re-emphasizing the point that this is a very powerful technique for managing IaC code, especially if you have insecure defaults at the start, and then you realize you have to update the subnet configuration in your IaC code and convert it into something more secure, so that the next time around attackers won't be able to attack certain resources now protected with the right configuration.
In addition to that, the moment you convert your infrastructure into code, we can now use pipelines to analyze security vulnerabilities automatically. There are different ways to analyze the code and the resources created from the code, and you basically have these pipelines. So in step one you have the code, you push it, and the pipeline gets activated. However, it's important that we're very careful when managing resources inside pipelines, because even if we're able to detect security vulnerabilities inside these templates, it's possible to have something like a poisoned pipeline execution, especially when you're utilizing cloud resources to run and convert these templates into actual resources. For one thing, again, resources in the cloud would probably have IAM roles attached to them. So when running templates inside these resources, there's going to be an IAM role which is checked first before specific actions can be performed.
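
As a rough illustration of why the role attached to a pipeline runner or an IaC server matters, the following Python sketch flags IAM policy documents that allow every action on every resource. It is not from the talk, and it is deliberately narrow: the function name `is_admin_like` is made up here, and the check does not catch service-level wildcards such as `iam:*`, which can also open the door to privilege escalation.

```python
import json


def _as_list(value):
    """IAM policies allow a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def is_admin_like(policy_document: str) -> bool:
    """Return True if any statement allows all actions on all resources."""
    policy = json.loads(policy_document)
    for statement in _as_list(policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action"))
        resources = _as_list(statement.get("Resource"))
        if "*" in actions and "*" in resources:
            return True
    return False


if __name__ == "__main__":
    super_admin = json.dumps({
        "Version": "2012-10-17",
        "Statement": [{"Effect": "Allow", "Action": "*", "Resource": "*"}],
    })
    print(is_admin_like(super_admin))
```

A check like this could run against the roles bound to pipeline resources, so that a convenient but over-permissioned role is noticed before a compromised runner can use it.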
So if that IAM role has super admin permissions, then the problem is that if there's a script or a payload injected into the template, then when the template runs and that set of scripts gets executed, it could be possible for something malicious to be executed inside the pipeline environment itself. That's very scary, because a lot of teams prioritize the production systems, basically the web applications and the resources there, when it comes to securing production environments. However, the weak link could be any existing pipelines which are used by the development teams. So again, make sure that everything deployed in your environment is properly secured.
Next, it's important to protect specific resources from accidental deletion or modification. A lot of us just think of IaC as a simple process where we write code and then resources are created. However, IaC involves modification and deletion as well. So what if you created databases using IaC code, and then suddenly somebody deletes the resources using an IaC solution? Then your production databases could be deleted automatically as well. So make sure you know the proper configuration parameters to ensure that certain resources in your infrastructure are not modified or deleted by default when using IaC solutions.
So that's pretty much it. Today we learned a lot of things, and we were able to learn how to secure resources and systems built using IaC tools. Thank you so much and have a great day ahead.
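
The point about accidental deletion can be made concrete with a small guard that a pipeline might run before applying changes. The sketch below assumes Terraform is the IaC tool (the talk mentions it only as an example) and reads the JSON plan representation produced by `terraform show -json`, where each entry in `resource_changes` lists its planned `actions`. The set of protected resource types and the function name are assumptions made for illustration.

```python
import json
from typing import Iterable, List

# Assumption: resource types this team never wants destroyed automatically.
PROTECTED_TYPES = {"aws_db_instance"}


def destructive_changes(
    plan_json: str, protected_types: Iterable[str] = PROTECTED_TYPES
) -> List[str]:
    """Return addresses of protected resources the plan would delete.

    A replacement shows up as both "delete" and "create" in the actions
    list, so it is flagged as well.
    """
    protected = set(protected_types)
    plan = json.loads(plan_json)
    flagged = []
    for change in plan.get("resource_changes", []):
        actions = change.get("change", {}).get("actions", [])
        if change.get("type") in protected and "delete" in actions:
            flagged.append(change.get("address"))
    return flagged


if __name__ == "__main__":
    sample_plan = json.dumps({
        "resource_changes": [
            {
                "address": "aws_db_instance.main",
                "type": "aws_db_instance",
                "change": {"actions": ["delete"]},
            }
        ]
    })
    flagged = destructive_changes(sample_plan)
    if flagged:
        raise SystemExit(f"Refusing to apply, plan deletes: {', '.join(flagged)}")
```

A check like this complements, rather than replaces, the tool's own safeguards; Terraform, for instance, has a `prevent_destroy` lifecycle setting that makes a plan fail if it would destroy the resource.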

Joshua Arvin Lat

CTO @ NuWorks Interactive Labs
