Conf42 DevOps 2023 - Online

Why you should never use static shared secrets in GitHub Actions

Abstract

This talk will focus on the existing vulnerabilities and downsides of using shared encrypted secrets to access key pieces of infrastructure programmatically using GitHub Actions, and how you can greatly improve the security of your GitHub workflows. As GitHub Actions has matured as a product and more and more companies rely on it for their CI/CD workflows, an exposed secret in the repository can be the difference between a team being able to efficiently test and deploy code, and an infrastructure penetration nightmare. Allow your DevOps engineering team to sleep easier at night by replacing GitHub secret management with Teleport Machine ID, allowing your GitHub Actions workflow to manage infrastructure resources like Kubernetes clusters, databases and more, all without having to rely on shared secrets stored in your repository. The talk will include an overview of existing vulnerabilities, how Teleport Machine ID rethinks automated infrastructure access, as well as a demo showing how you can control your Kubernetes clusters using GHA workflows without secrets while keeping a full audit trail and control of your workflow. Leave shared secrets behind in 2022 and enter the new year with Teleport Machine ID, giving you, your engineers, and customers peace of mind.

Summary

  • Kenneth Dumez: Why you should never use static shared secrets in GitHub Actions. GitHub Actions allows you to centralize all of your integration and development testing workflows. Most organizations are just not equipped to deal with secret leaks.
  • Using a solution like Teleport Machine ID for GitHub Actions, you can manage access with automated short-lived certificates instead of long-lived credentials. This means you don't have to store any static credentials in your CI/CD workflows.
  • Teleport Machine ID integrates with your GitHub Actions workflows and reduces the risk of a hack through a leaked credential. Check us out on Slack at teleport.slack.com. We have an open source and an enterprise version.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hi, my name is Kenneth Dumez and I'm a developer relations engineer here at Teleport. Thank you so much for taking the time to come listen to my talk. We've got so many good ones at the conference this year, and I really urge you to check them all out if you get the chance. The folks from Conf 42 really know how to get a group together. Today I hope I can teach you a couple of things about securing your automated workflows, how the landscape looks right now, and why it's probably a bad idea to use long-lived static credentials in your various CI/CD flows. Today we're going to focus especially on GitHub Actions, hence the name of the talk: why you should never use static shared secrets in GitHub Actions.

Many of you are probably familiar with these two logos; if not the one on the right, certainly the one on the left. Gotta love that strange little Octocat guy that the GitHub folks have conjured up. The logo on the right, if you're not familiar, is for their CI/CD solution, GitHub Actions. GitHub Actions is great because it allows you to centralize all of your integration and development testing workflows in the same place as you keep the code you're testing. That way there's no need for a bunch of other repositories floating around with different workflow configuration files, et cetera; there's no need for separate DevOps repos. You get this nifty little UI where you can see all of your test runs, and you can click into individual runs and see all of their details. It's really a great tool for managing your development lifecycle in a pretty intuitive manner. The GitHub Actions config files themselves are also pretty simple, so it's easy to get started. It's really just a great solution that doesn't overcomplicate things, with minimal overhead. This is just a little example from one of our repos at Teleport, from the instruct labs that we have.

And I'm certainly not alone in my opinion of the tool. This is data from HG Insights that shows the adoption of GitHub Actions by companies in the last year. As the product has matured, its user base has also grown wildly and still continues to grow. When GitHub Actions first came out, it was a little bit rough around the edges, but as it has matured, adoption has skyrocketed. And this is only tracking enterprise organizations; it doesn't account for the thousands of open source projects that are also relying on GitHub Actions for their CI/CD needs. As you can see, over the last twelve months there's been a 71.88% increase in companies using GitHub Actions. This brings the current running total of companies that HG Insights tracks to 9,406. That's a lot of companies.

And if you've seen some of my other talks, you know I love the GitGuardian State of Secrets Sprawl report. These are the most recent numbers from their 2022 report, looking back on the past year. I really do love this report because it illustrates just how big the problem with secrets, especially in GitHub, is. You would think by now we as an industry would start adapting our practices a little bit and being more careful with how we manage credentials. But no, the problem is actually getting worse. Six million secrets were leaked in 2021; that was double the number from 2020. Part of this has to do with the increasing number of companies moving their infrastructure from more traditional on-prem setups over to the cloud.
As there are more cloud resources, of course there are going to be more credentials required to access them: different access tokens, API keys, long-lived passwords, you name it. And frankly, most organizations are just not equipped to deal with these leaks. Another quote from the report is that in 2021, a typical company with 400 developers and four AppSec engineers would discover, on average, 1,050 unique secrets leaked upon scanning its repositories and commits. And each of these secrets is typically not leaked in an isolated way in just one place; on average, each individual secret appeared in 13 different places across the code base. Accounting for all of this duplication, this means that a single AppSec engineer, on average, needs to handle 3,413 secrets annually. That is simply not sustainable. Those poor AppSec engineers need a break.

So there are a couple of different solutions to this problem. How do we deal with credentials? How do we deal with secrets in our repos? One of the purported solutions is, of course, just to use GitHub's encrypted secrets. These are pretty good: everything is encrypted on the client side and then decrypted at runtime, so the secret can just be injected into the workflow. And GitHub actually does use a mechanism that attempts to redact any secrets that appear in run logs or get exposed in other ways. However, because there are multiple ways secret values can be mutated and transformed during runs, accidental exposure does happen.

Another problem is dynamic credential exposure. For example, say you're using a private key to generate a signed JWT to access a web API. Unless you register that JWT as a secret in GitHub, it won't be redacted and can be exposed in logs, standard out, and standard error, anywhere it happens to be printed.

Another issue is chain of custody. This is important because any user with write access to your repository has read access to all the secrets configured in your repo. That makes it very difficult to audit and keep track of who is accessing your resources, at what time, and who is doing what with your various secrets. This becomes a bigger problem at scale: it might be manageable if you have three or four engineers, but with the earlier example of 400, it's a lot to keep track of.

There's also the issue of duplication. In an ideal world, of course, the secrets you are using in your GitHub Actions repo would only live there and there alone. However, a common setup I've seen in the past is that these secrets are actually duplicated across various places in your infrastructure. They might be stored in a password vault, for example, as well as in the GitHub repository. This is really convenient for an engineer, because if you want to manually access a resource, you have the credentials at hand: they're right in the vault, you can look them up and go from there; they're not hidden behind GitHub's encryption. The problem, though, is that now you have these credentials floating around in a few different places. This makes it very difficult to rotate these creds. Say an engineer leaves, a new one joins, or a credential gets compromised and you need to rotate it: you now have to track down all the different places that credential is used and rotate it in every single one of them. It becomes a lot very quickly.
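To make the pattern concrete: a workflow file is just a short YAML file checked in under .github/workflows/, and the setup described above usually looks something like the sketch below. The secret names, scripts, and URL here are hypothetical, used only to illustrate the shape of the problem rather than any specific repository:

    name: deploy
    on:
      push:
        branches: [main]

    jobs:
      deploy:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v3
          # A static, long-lived credential pulled from GitHub's encrypted secrets.
          # Anyone with write access to the repository can use it, and in many
          # setups the same value also lives in a password vault, so rotating it
          # means chasing down every copy.
          - name: Deploy to the cluster
            run: ./deploy.sh            # hypothetical deploy script
            env:
              KUBE_TOKEN: ${{ secrets.KUBE_TOKEN }}
          # A derived credential, like a JWT signed with a key from secrets, is
          # not itself registered as a secret, so GitHub's log redaction will not
          # mask it if it ends up on stdout or stderr.
          - name: Call an internal API
            run: |
              TOKEN="$(./sign-jwt.sh)"   # hypothetical helper that signs a JWT
              curl -H "Authorization: Bearer ${TOKEN}" https://internal.example.com/api
            env:
              JWT_SIGNING_KEY: ${{ secrets.JWT_SIGNING_KEY }}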
This also expands the attack surface that would allow malicious actors to take advantage of these credentials. The more places you have these secrets stored, the less secure they are, leading to more chances for mistakes and compromised developer efficiency whenever secrets are added, removed, or need to be rotated.

Another avenue is saying, okay, we know that secrets are probably going to be leaked at some point, so we should constantly be monitoring our repositories for those creds so we can respond as quickly as possible to leaks. This is where the monitoring and scanning solutions come in, like GitGuardian. These tools are great, and they're not mutually exclusive with using encrypted secrets where you can, but really they're just not quite enough. They're more of a reactive solution that you can use to do damage control rather than preventing the problem at the source, which is always the end goal: make sure the problem doesn't happen in the first place. They also often require manual intervention. Say a scan picks up a security leak and a secret gets out there: a security engineer gets pinged and has to put down dinner with their family, then go rotate that cred in the password vault and delete it from the 13 places it appears in the leaked code base. And anyone who's had to delete old commit history from GitHub, sift through that huge tree, and try to repair it knows how difficult that can be. It's a real mess to delete and overwrite GitHub history to make sure that commit is fully, fully gone. So again, these scanning and monitoring tools are great, but they just don't go far enough, and they certainly aren't enough by themselves.

So this leads to the question of, well, what can we do about this? What can we do about all of our secrets? What if we simply removed the long-lived credentials? Keeping long-lived credentials safe is hard. It's really, really difficult. The reality is that as long as they exist, no matter everyone's best intentions to follow best-practice security guidelines, always encrypt those secrets, make sure they're rotated, don't leak anything, humans are human, right? They will eventually make a mistake. And when they do, if it's not properly handled immediately, there can be huge repercussions. We're talking customer data leaks, we're talking bitcoin miners in all of your infrastructure, costs shooting up to millions of dollars, et cetera. And you might stop 99 out of 100 of those leaks, maybe 999 out of 1,000. But eventually one of those secrets is going to make it into a Pastebin file somewhere on the dark web that some kid in Brussels is going to sell to buy NFTs, or whatever hacker teens in Brussels do. It's not going to be good.

So one of the ways we can actually eliminate these long-lived credentials is by using a solution like Teleport Machine ID for GitHub Actions. In Teleport 11, one of our most recent releases, we added support for GitHub Actions workflows. With Teleport Machine ID, instead of managing your access using long-lived credentials, you join each infrastructure resource to your Teleport cluster and use automated short-lived certificates. There are no credentials to manage, there's a rich audit log of everything happening in your CI/CD environments, and you have that chain of custody even for your automated worker nodes.
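In a GitHub Actions workflow, that ends up looking roughly like the sketch below. The action names (teleport-actions/setup and teleport-actions/auth), their inputs, the proxy address, the token name, and the node name are assumptions based on how Teleport documents its GitHub Actions integration, so treat this as an outline rather than an exact copy of the demo:

    name: machine-id-example
    on:
      push:
        branches: [main]

    jobs:
      teleport-demo:
        runs-on: ubuntu-latest
        permissions:
          # Lets the runner request a GitHub-issued OIDC token proving which
          # repository and workflow it is; Teleport checks this against the
          # join token's allow rules instead of a shared secret.
          id-token: write
          contents: read
        steps:
          - uses: actions/checkout@v3
          # Install the Teleport binaries on the runner
          - uses: teleport-actions/setup@v1
            with:
              version: 11.1.0
          # Exchange the runner's identity for short-lived certificates from the
          # cluster; nothing static is stored in the repository's secrets.
          - uses: teleport-actions/auth@v1
            with:
              proxy: example.teleport.sh:443     # hypothetical Teleport proxy address
              token: example-github-token        # join token created on the cluster
          # Use the short-lived credentials like any other Teleport client would.
          # Depending on the action version, you may need to point tsh at the
          # identity file the auth step produces.
          - name: List nodes reachable through the cluster
            run: tsh ls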
So this is kind of a higher-level architecture diagram showing how Teleport Machine ID can interact with a Kubernetes cluster. The worker node will actually refresh its credentials on a cadence, getting a new kubeconfig from the Teleport host and renewing its access in an automated, secure fashion. So in this instance, the Machine ID worker node does not use any persistent credentials. It has these short-lived certs that it renews from the Teleport host, making sure there are no secrets to manage, nothing to jumble. And this integrates with your GitHub Actions workflows to make sure that you don't have to use any static credentials in your CI/CD workflows.

So let's check it out, do a little demo, see it in action. Cool. First, what we're going to need to do is create a join token. These tokens set out the criteria by which the auth server decides whether or not to allow a bot or node to join. To create a token, we can write the resource's YAML to a file on disk and then use the Teleport CLI control tool, tctl, to apply it.

Let's take a look at our token here. So we have our token; it's pretty simple. We have the name, conf 42 GitHub token. We have when it expires, which I just set to the year 2100 for this example. It's going to be around for a while, but you can set this arbitrarily as you'd like. Then we have the spec, which contains things like the roles. The roles field defines which roles this token will grant access to: the value Bot states that this token grants access to a Machine ID bot, and then we have Kube, which specifies that it will allow the bot to interact with Kubernetes resources. We have the join method, the bot name, which is the name of the bot, and then the GitHub section. This will be the repo that we'll be running our actions from. I just spun up a quick little demo repo for the purposes of this example in my personal account: Dumez, conf 42 demo.

Next, we'll actually create the token resource using that CLI utility I mentioned earlier, tctl. To create the token, we'll run tctl create. This command takes in the config YAML and creates a resource on our cluster. Great. Now let's just check to make sure that the token was created successfully. We can do that by running tctl tokens ls. As you can see, we have the name of our token here, conf 42 GitHub token, with the expected types, Bot and Kube. Perfect.

Next, what we're going to do is actually create our bot. This will be the bot that runs all of the commands triggered by our GitHub Actions workflow. The Machine ID bot created in this example will be used to access a specific node on the cluster via tsh ssh, Teleport's SSH utility, and will therefore require a role that can access the cluster as needed. This example configuration will apply the access role; however, care should be taken to create or apply a role according to the principle of least privilege in production environments. For this demo it doesn't matter as much, but if you're using this in production, you always want the role to have the least privileges possible. Additionally, it should have explicit access to the cluster using a username created specifically for the bot user alone; do not share this username with any other use case. So here we have our command, tctl bots add conf 42 demo, the name of our bot; we give it the role access, and we input the token here. We also give it the login ubuntu.
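For reference, the token resource and commands just walked through might look roughly like the following. The names, repository, and flags are reconstructions from the spoken walkthrough rather than the demo's literal files, so double-check them against the Teleport docs before reusing them:

    # conf42-github-token.yaml -- hedged reconstruction of the demo's join token
    kind: token
    version: v2
    metadata:
      name: conf-42-github-token
      expires: "2100-01-01T00:00:00Z"   # far-future expiry, as in the demo
    spec:
      # Roles this token can grant: a Machine ID bot, plus Kubernetes access
      roles: [Bot, Kube]
      join_method: github
      bot_name: conf-42-demo
      github:
        allow:
          # Only workflows running from this repository may join with this token
          - repository: dumez/conf-42-demo

    # Applied and verified with the tctl control tool, roughly:
    #   tctl create -f conf42-github-token.yaml
    #   tctl tokens ls
    #   tctl bots add conf-42-demo --roles=access --token=conf-42-github-token --logins=ubuntu
    #   tctl bots ls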
Again, in a production environment, you're going to want a specific user for this. Great, so it looks like it worked. Let's just check to make sure that the bot was successfully created. To do this, we can use tctl bots ls, and here we have our demo bot, conf 42 demo, and you can see that it has the correct user and the correct roles. Awesome.

Now let's take a look at this example GitHub Actions workflow I created earlier. This workflow leverages two existing Teleport actions, which first install Teleport on the Actions runner and then authorize the runner by fetching the Machine ID credentials for our bot. Then we'll list the remote SSH nodes we have access to on the cluster, and finally we'll SSH onto one of our nodes and write the GitHub commit SHA that triggered the workflow to a file on the SSH node. As you can see, we have this tsh command that will echo the GitHub commit SHA to a file called github run logs. Perfect.

So now let's go ahead and actually commit this action and see it in action. We're going to say: demo, add conf 42 demo action. Now let's push this workflow up to our repo. So we've run the git push, we have our bot; now let's go check on our Actions page. We can see the action being run; this was triggered by the push to main. Let's just give it a second, and we should be able to see this action actually run on the GitHub Actions runner using the Machine ID bot with its credentials. And as you can see here, we don't have any secrets or any long-lived credentials in use. All we're using is this short-lived certificate produced by Machine ID. Awesome. And our job ran successfully.

So now let's log into our cluster here. This is the Teleport UI, which allows us to interact with our Teleport cluster and check on all of our activity. If we go to the audit log, we can actually see the certificate being issued and we can see what the bot was doing. We can see that it started the session and executed a command on the node, the k8s host, and then we can see the exact commit SHA and exactly what was run on that worker node. Just like that, we're able to manipulate our resources managed by the Teleport cluster, all without using any long-lived static credentials stored in our GitHub. There's nothing to be leaked. And all of this is fully extensible and fully configurable for your various needs, so you can do a lot of different things with it. You can manipulate SSH nodes, you can manipulate databases, you can even manipulate Kubernetes clusters; whatever kind of resources you have, you can use Teleport to manage them from your GitHub Actions.

Great. So that's a little bit, in a nutshell, about how Teleport Machine ID can integrate with your GitHub Actions workflows to secure your CI/CD pipelines and reduce the risk of a hack through a leaked credential, completely eliminating static credentials from your workflows and making it so that your engineers can sleep a little bit easier at night. Thank you so much. I hope you learned a little bit from this talk, and you should go out there right now and secure your GitHub Actions. Check us out on Slack at teleport.slack.com. We have a great community there; I'm hanging out there all the time, so we can chat. And if you want to learn more, just go to teleport.com.
Remember, we have an open source and an enterprise version, so if you want to download it and just hack around with it and try it out for yourself, feel free to do so. It's a lot of fun. Thank you again so much and I hope I see you at the next talk.
...

Kenneth DuMez

Developer Relations Engineer @ Teleport

Kenneth DuMez's LinkedIn account


