Conf42 DevSecOps 2022 - Online

Stop Committing Your Secrets - Git Hooks To The Rescue!

Abstract

Committing secrets is a huge problem. By the time GitHub, or other services, scans for secrets, it is far too late. The best way to not push secrets is to never commit them. Git provides a clean path for this and this talk will walk you through making Git your ally in keeping secrets safe.

Summary

  • Dwayne: Stop committing your secrets. Git hooks to the rescue, here at Conf42 DevSecOps 2022. I work for a company called GitGuardian. We are a security platform mostly concerned with secret detection, secret sprawl detection and remediation.
  • In 2021 we found over 6 million keys, credentials, and other secrets just laying in public repos. About three out of every thousand commits contain something that probably shouldn't be in there. If you touch the software development lifecycle, you are responsible for making sure your secrets don't end up in the wrong place.
  • AWS Labs has git-secrets. It lets you triple check before you make the commit. TruffleHog is an open source framework. All of them are straightforward to set up. Pre-commit hook setup is the only one that was a little bit tricky for me.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hey everybody, welcome to my talk, Stop Committing Your Secrets: Git Hooks to the Rescue, here at Conf42 DevSecOps 2022. So, I'm Dwayne, I live in Chicago, I've been a developer advocate since 2016, and you can find me on Twitter at mcDwayne. I'm happy to talk to you about anything we talk about today and other stuff besides, like improv, karaoke, and rock and roll. I work for a company called GitGuardian. They'll come up later in this talk as well. We are a security platform mostly concerned with secret detection, secret sprawl detection and remediation, as well as infrastructure as code and some other areas as we've progressed.
Really what we're talking about today is the eternal cat-and-mouse game of hackers trying to get at your data and your information. One of the ways they have to get in is through finding hard-coded secrets laying around. And just to put this in perspective, a few case studies here. Uber: they got owned by a teenage hacker earlier in 2022. He phished for initial access rights, then immediately found a lot of hard-coded credentials in PowerShell scripts. Then he went to the New York Times and said, hey, look what I did. Toyota: they had a secret data server key in a public repo by accident for about five years, because a subcontractor pushed something public that shouldn't have been public. 296,000 customers have been affected. Samsung: 160 gigs of stolen data. And when that was pushed out to the public, it was discovered that over 6,000 secrets (API keys, passwords, credentials) had been hard-coded throughout their code base. There's no proof that it led to the second data breach reported in September, but odds are it didn't help things. AstraZeneca: this is a very recent case. They pushed hard-coded credentials to a test environment, and then, through what they're calling user error, actual patient data ended up on that test environment. It's kind of a nexus of a lot of bad things happening at once.
The credentials were exposed for over a year. We're not sure at this point in time, as of this recording, exactly what the ramifications of this are and exactly how many customers were affected.
You can have the best security setup in the world and the best security teams, but if you just leave those keys out there, it's pretty easy to get in. And we probably wouldn't do this. That's a little bit of a silly example. Whatever's behind that door can't be that valuable. Like maybe staplers or binders or, I don't know. But while we don't do this, developers do this, and not for malicious reasons. We need to test to see if that API endpoint is up. We need to test if that credential actually works. So we hard-code it, and if we immediately take it back out, then there's no issue. The problem, though, is that we do leave these in, and by the time we think to take them out, it's far too late. Far too late meaning that we've already shared it out there on GitHub or GitLab or wherever our remote servers are.
In 2021, we at GitGuardian found over 6 million keys, credentials, and secrets just laying in public repos. This is a huge increase over the previous year. We put this out in our State of Secrets Sprawl report. You can read it if you want. The disturbing thing here is that this has actually increased year over year. We're finding that about three out of every thousand commits contain something that probably shouldn't be in there, some kind of secret. And this is just in public repos.
So who's responsible for all this? Well, ultimately everyone is. If you touch the software development lifecycle, you are responsible for making sure your secrets don't end up in the wrong place. And if it's just the security team, we're never going to win this fight, even in the best organizations. Alex Ray from HackerOne tells us it's 100 to one: the security team is outnumbered by developers. So we really need to shift it to the developers who are at the forefront of this battle.
But do it with a tool they already have access to, one that they already love. Developers love and hate Git, but we use it day in and day out. I say almost all developers, with an asterisk there: last time I checked, it was something like 93.6% of all developers touching Git on a daily basis. And Git's awesome. But it doesn't make your code more or less secure in and of itself. It's the stupid content tracker. It does what it does exceptionally well.
It does give you a way to add some security: .gitignore. Please use .gitignore files. Tell Git to ignore certain types of files or certain paths, and then we can start storing our access keys in places like a secrets.json file or AWS file directories, places that we're never going to check into our source control. And if we combine that with things like HashiCorp Vault or Azure Key Vault (I'm just throwing out two names, not trying to plug either), then we don't need to hard-code secrets anymore, and we can prevent them from getting into our source control. In a perfect world, that's the end of the talk and we go about our day. But did we solve this problem of hard-coding secrets? We saw the data that says this is a growing problem, not a shrinking one. So I personally don't believe that.
The issue isn't that we've tested a secret. We have to test secrets. We just have to. The problem is that sometimes we forget to take them back out, and then we push those secrets somewhere, and then we have a bad time. In theory you can remove a secret from a pushed commit, but it's not easy, and it can be downright painful in a number of ways. One, just physically figuring out everywhere that secret went, across which branches. But also, you've got to talk to your team now. You're going to need to rotate keys somewhere. Someone's going to have to stop what they're doing and go deal with this now. And no one's going to look good in that process. Again, painful. What we need is a robot. A robot that reliably, every time we try to commit a secret, just stops us.
And Git gives us a way to build that robot. And again, developers love Git. So here we go. Git hooks is an automation platform built into Git that I think is wildly underused out there. We can use it for so many awesome things, but at the heart of it, Git hooks is this: you can build your own contraptions so that when Git does a thing, it will fire off one of your scripts. That's pretty much it. There are 17 hooks available. Go over to githooks.com and bookmark that. It's a great resource for building all sorts of cool things with Git hooks.
But the ones we're really concerned about are the ones that stop us from making the commit in the first place. Those are the pre-commit hooks. And additionally, from the server perspective, from our Git remote perspective, whoever owns that can stop those commits from even getting there in the first place. Because we can start using pre-receive hooks to say, well, if a secret is hard-coded in here, don't even let it on board, let's just stop it where it is. And Git comes with examples for all this. Unfortunately, they're really kind of hard to parse if you're not deeply familiar with Git commands like rev-parse, or if you don't have the exact use case that Linus Torvalds or gitster has for managing a large project.
Fortunately, though, scripts are just scripts. You can make them do anything. You can write them in whatever language you prefer, whatever scripting language you can access in your environment. This is actually something I do in my personal projects: I make my Git logs tell me jokes. So when I do Git commits, they spit out a dad joke at me. Props to Ed Thompson for building this into git dad and making the code open source.
So an ideal solution would look like this. If we're building that robot, every time I go to make a commit, every time I go to push that commit around, we should have something check to see if a credential is in there, if a secret is in there. If it is, throw an error and don't let me make that commit. You can build this yourself.
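A minimal sketch of the "robot" described above, written as a Python script that a pre-commit hook could call. It is not taken from the talk: the `SUSPICIOUS` pattern, the function names, and the `--staged` flag are all assumptions made for illustration, and a real rule set would need far more tuning.

```python
#!/usr/bin/env python3
"""Sketch of a pre-commit check: block the commit if staged changes look like secrets."""
import re
import subprocess
import sys

# Illustrative pattern only (an assumption, not a vetted rule set): a quoted value
# of 8+ characters assigned to a name such as password, secret, token or api_key.
SUSPICIOUS = re.compile(
    r"""(?i)\b(password|secret|token|api_key)\b\s*[:=]\s*['"][^'"]{8,}['"]"""
)


def added_lines(diff_text):
    """Yield lines added by a unified diff, skipping the '+++ b/file' headers."""
    for line in diff_text.splitlines():
        if line.startswith("+") and not line.startswith("+++"):
            yield line[1:]


def find_secrets(diff_text):
    """Return the added lines that match the suspicious pattern."""
    return [line for line in added_lines(diff_text) if SUSPICIOUS.search(line)]


def main():
    # `git diff --cached` shows what is staged, i.e. what this commit would record.
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout
    hits = find_secrets(diff)
    for line in hits:
        print(f"possible secret in staged change: {line.strip()}", file=sys.stderr)
    # Git aborts the commit when the pre-commit hook exits with a non-zero status.
    return 1 if hits else 0


# The --staged flag is a hypothetical switch so that importing the file has no side effects.
if __name__ == "__main__" and "--staged" in sys.argv:
    sys.exit(main())
```

To try it, an executable `.git/hooks/pre-commit` file could run this script with `--staged` and pass along its exit status; where the script itself lives is up to you.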
Nothing to stop you, if you've got enough time and patience. Git grep is a great way to go about it. Grep is awesome. You've got to know regex, but you can make it look for any kind of pattern here. This is what an AWS access key pattern looks like: it's 20 characters long and it contains all caps and numbers. You can just make "password equals" a pattern you're looking for. And yeah, sure enough, it will catch those things. The problem, though, is that if you build it, then you have to maintain it, and deal with false positives, and know what's allowed and what's not allowed, and start dealing with all of the fine-tuning of it. And then to spread this to your team, to evangelize it to your team, good luck.
But hey, you're not the only person facing this. A lot of other people have already tried building solutions for this, and I'm going to talk about a few of them today. Open source to the rescue. Because open source is awesome and everybody should be using open source tools wherever they can. I firmly believe so. There are a lot of them. If you just Google solutions for this, something like "stop committing hard-coded secrets" or "prevent hard-coded secrets open source", you're going to find a lot of solutions. I'm going to talk about what I think are the big three, from my point of view in the world. Some tools also have built-in ways to do this through their proprietary offerings, but we're not going to talk about those today.
The big three, I think, start with AWS Labs, which has git-secrets. It lets you triple check, before you make the commit, for anything that is an AWS-looking secret. You can extend it, and people have extended it, but it does require a good amount of knowledge of regex and specifically of the patterns you want to look for. So if you're using Google Cloud or Azure DevOps, you're going to want to know what those patterns look like pretty intimately to be able to adapt it to your use case.
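The two do-it-yourself patterns mentioned above can be sketched in Python as follows. This is an illustration under stated assumptions, not the pattern set of git-secrets or any other tool: the `AKIA` prefix is a commonly seen start of AWS access key IDs rather than something stated in the talk, and the `ALLOWED` set is a hypothetical allowlist showing where false-positive handling would begin.

```python
import re

# Assumption: an AWS access key ID is 20 characters of capitals and digits,
# and commonly begins with "AKIA". The prefix is not from the talk.
AWS_ACCESS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
# The "password equals" idea, loosened to ignore case and surrounding spaces.
PASSWORD_ASSIGNMENT = re.compile(r"(?i)\bpassword\s*=\s*\S+")

# Hypothetical allowlist of known-safe strings, such as AWS's documentation example key.
ALLOWED = {"AKIAIOSFODNN7EXAMPLE"}


def scan(text):
    """Return (line_number, rule_name) for each suspicious match in text."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in AWS_ACCESS_KEY_ID.finditer(line):
            if match.group(0) not in ALLOWED:
                findings.append((number, "aws-access-key-id"))
        if PASSWORD_ASSIGNMENT.search(line):
            findings.append((number, "password-assignment"))
    return findings
```

Even this small version shows the maintenance cost: every new credential format needs its own expression, and every legitimate test value needs an allowlist entry.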
And again, people have extended it, so definitely go out and take a look at it if you're just getting started with this. But that's where it stops. It triple checks those before commit and then you're done.
TruffleHog is a name you've probably run into. It is an open source framework, mostly known for its GitHub Action integration. It does have a pre-commit hook integration, but that requires you to use the pre-commit framework, which is awesome. It's an awesome framework, it's open source, it's cool, but it is required. And also, some people report that it has a high rate of false positives. Your mileage may vary. I'm not here to give any judgment about it, just reporting what I have found from my research. I've never actually used TruffleHog in production myself.
GitGuardian makes ggshield, which can be used at the pre-commit level, the pre-push level, and the pre-receive hook level to make sure you're not passing around those commits if secrets do get hard-coded in. Now, it does require a GitGuardian account, which is a platform you can use for free for personal and open source use. There is an API limit to that, a thousand calls a month. So if you're doing a lot of commits, this isn't free, but for most people it's probably going to be free, especially if you're working on open source projects. And I like it, partially because it checks for 350 known patterns and you can extend it. But again, extending it means regex fun.
So what does this look like in action? I'm not going to go through all of them. They all pretty much look the same at the end. Installing and configuring any of them is straightforward. I must say, pre-commit hook setup is the only one that was a little bit tricky for me. Other than that, they're all pretty straightforward to get going, and they all do the same thing. I hard-code a secret somewhere, in this case a YAML file, and I go to commit it.
And in this case GitGuardian is the one saying, hey, we found a secret here, and it just stops. It fails. And that's the mission. That's what hooks are doing for me every time. I personally don't worry about hard-coding secrets anymore, because if I accidentally do, I'm not even going to commit it in the first place.
So in conclusion, don't hard-code your secrets. If you do end up hard-coding secrets, don't commit those secrets. And the best way to do that is just to use some automation and some off-the-shelf tools. I love open source, and you should too. So I've been Dwayne, I'm a developer advocate, have been since 2016. Hit me up on Twitter with any questions you have because, well, this is pre-recorded, and you can reach me there. And thanks again for coming to my talk, Stop Committing Your Secrets: Git Hooks to the Rescue, here at Conf42 DevSecOps 2022.

Dwayne McDaniel

Developer Advocate @ GitGuardian



