Conf42 Site Reliability Engineering 2021 - Online

Keys or Certs for SSH Access? Why should I care?


Abstract

Today’s software developers, DevOps teams, SREs, and SysAdmins are familiar with the concept of public-key cryptography for gaining access to remote resources, but using public and private keys has its limitations and can be difficult to scale and manage. Consider leveraging short-term certificates for SSH access over keys and rest easy!

This talk will go over the pros and cons of using keys vs certificates and how to get started using short-term certificates!

Summary

  • Today we're going to talk about keys or certificates for SSH access. Key-based access is how we've been reaching remote resources for years, and it's still very popular.
  • For years we've been using public key authentication (PKA) for accessing resources. It's difficult to find a system that does not support public key authentication today. So why change from keys to certificates for authentication? If the system isn't broken, why change it up?
  • What happens when a user moves on to a new role? What about when a device, such as a laptop, is compromised or stolen? Keys also add up quickly: 50 servers times 30 developers is 1,500 keys that you now have to manage.
  • Someone accidentally commits a private key to a public repo. Within minutes it can be used to log into a service and cause chaos. The responsibility sits on you as an SRE to handle and manage this. Scaling out deployments can also be a wee bit challenging.
  • OpenSSH, which is pretty much the de facto standard for SSH, added certificate support back in version 5.4. Companies are already using certificate authentication today; Netflix, for example, open sourced BLESS, which uses certificates. There is a little bit of a learning curve, but it does pay off in the long run.
  • Certificates require a certificate authority that owns the public/private key pair used to sign those certificates. If a certificate is tampered with, the signature breaks and the certificate is invalid. Certificates can be set to expire; a validity window can run, for example, from two weeks ago until two weeks from now.
  • The demo shows how to SSH in with certificates, without a standard username and password or an authorized public/private key. It is a demo repo, not something you want to run in production. It's designed to help you understand how SSH certificates work in a little more detail.
  • On top of that, we are hiring. If you're interested in working for a Series B funded, fast-growing startup, we'd love to have you apply with the link there. We have 100 employees and are fully distributed, working on open source. Anyways, that's all I've got today.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello and welcome to today's session here at Conf42. My name is Allen Vailliencourt. I'm a sales engineer with Teleport. Today we're going to talk about keys or certificates for SSH access, and why you should care, especially if you're coming from the SRE world. So let's jump right into it. This is probably what you're used to seeing if you're doing any kind of Linux systems administration work. This is probably a very familiar screen to you: you fire up your local terminal and you see a bunch of public and private keys. Or maybe even this, when you're accessing a remote resource: you're using SSH, specifying your identity file and your user, and then you're logging in and moving on. This is how we've been accessing resources for years, and it's still very popular. It's not going away anytime soon. So continuing on this, what about your servers? How long is your authorized_keys list? Have you taken time to go look at one of your production servers and do a cat of that authorized_keys file? You might be surprised and say, whoa, there's a bunch of entries on here. So the big question is, do you even know which of those keys are valid? So let's look at public key authentication, and at a few of the pros around it. Why do we have it today? We're not going to go into a deep history of it, just a high level. For years we've been using PKA for accessing resources. It's not going away anytime soon. Today, to gain access to AWS, GCP, probably your routers, your switches, your Linux servers, wherever they might live, you're more likely than not leveraging something along the lines of public key infrastructure with OpenSSH. The original intent behind it was better security.
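As a rough way to answer the "how long is your authorized_keys list" question above, the sketch below counts the key entries in the text of an authorized_keys file and pulls out each entry's key type and comment. It is a simplified illustration, not a full parser: the function name, the sample entries, and the truncated key blobs are all made up for the example, and unusual option strings may confuse it.

```python
# Prefixes of common public key types that can appear in an entry.
KEY_TYPES = (
    "ssh-ed25519", "ssh-rsa", "ssh-dss",
    "ecdsa-sha2-", "sk-ssh-ed25519", "sk-ecdsa-sha2-",
)

def parse_authorized_keys(text):
    """Return a list of (key_type, comment) tuples, one per key entry."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        # An entry may start with an options field, so search for the key type.
        for i, field in enumerate(fields):
            if field.startswith(KEY_TYPES):
                comment = " ".join(fields[i + 2:]) or "(no comment)"
                entries.append((field, comment))
                break
    return entries

# Hypothetical file contents; the key material is a placeholder.
SAMPLE = """\
# deploy keys
ssh-ed25519 AAAAC3Nza...example1 alice@laptop
no-port-forwarding ssh-rsa AAAAB3Nza...example2 bob@old-desktop
ssh-ed25519 AAAAC3Nza...example3
"""

if __name__ == "__main__":
    found = parse_authorized_keys(SAMPLE)
    print(f"{len(found)} key entries")
    for key_type, comment in found:
        print(f"  {key_type:12} {comment}")
```

An entry with no comment, like the last one in the sample, is exactly the kind of key whose owner is hard to identify later.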
So with a good PK system in place, users are not having to worry about Post-it notes and passwords. They're not having to worry about long, complicated passwords. Back in the day, for us old salts, you probably remember something like rhosts, right, and accessing remote resources that way. It's come a long way. Having passwords is one way, and then when public and private keys started coming out, it just simplified things, especially from a systems administration standpoint. And with PKA, you can even automate your processes today. If you're using CI/CD, Jenkins, Bitbucket, Bamboo, Ansible, CircleCI, Terraform, GitHub, whatever, you can leverage public/private keys to access those resources. In fact, most of today's modern services like GitHub or Bitbucket or GitLab do not necessarily recommend using passwords to authenticate in order to push your code up. They'd rather you generate a public/private key pair in order to do that, and they have full support for that. So keys are super easy to create and deploy. We've been doing it for years: ssh-keygen -t and then ed25519. That's the key type that I use. You might use RSA or one of the others, and that's fine. This is not the webinar for talking about that, but generating a key is really easy to do. So that brings us to this: it's difficult to find a system that does not support public key authentication today. So the question is, right, keys are superior. Change my mind. That's kind of what we're thinking many times, right? Why change from keys to certificates for authentication? What's the reasoning behind this? If the system isn't broken, why change it up? Well, we're going to talk a little bit about that and then we'll jump into a high-level demo of how you can get started today. So here are a few cons that exist around PKA. What happens when your user moves on to a new role?
Maybe they move on to a new job or change departments. Maybe they were part of your SRE team, sysadmin team, or security team, but then they went to development or product marketing. They had access to all these resources through public/private keys, but now they're no longer there. So what did you do? Or have you done anything? What about when a device is potentially compromised or stolen? Laptops, for example. With many of us working remotely and distributed, and laptops being powerful enough, most developers and people in this world work on laptops because we're so remote and traveling. And if a laptop doesn't have local encryption turned on and it gets compromised or stolen, what then? This interesting article from ssh.com talks a little bit about the spread and the growth of keys out there. Just in this piece, talking about one customer in the financial sector: 3 million keys and 750,000 distinct key pairs from 15,000 servers. For large environments, that's probably on par for the norm. Maybe in your environment you could start calculating it out, and you're probably thinking, you know, we've got quite a few. Or you might think, well, maybe we're just small. There's only 10, 20, 30 of us. We only have like 50 servers and there's not that much. Well, do the math: 50 servers times 30 developers is 1,500 keys that are now out there that you have to manage and rotate, or not rotate, for accessing your resources. So continuing on, what about when someone accidentally commits a private key to their public repo? Within minutes this can be utilized to log into a service and cause chaos. Let me pull up a story. Story time, ladies and gentlemen. Years ago at a place I worked at, we had a gentleman who was one of our new DevOps engineers, an SRE. This person was coming from a traditional world and wasn't used to using Git and GitHub and committing stuff out there.
So they had their private key for AWS and they accidentally committed it to their repo. And that repo wasn't set to private. It was a public repo out there for testing. Well, guess what happened? As you can already imagine, bots were able to scan that within minutes, and within hours we had over 200 servers being spun up in AWS data centers all over the world. We got the email alerts from AWS, so immediately a bunch of us jumped on. We started shutting down and deleting all these servers. We nuked that key so that it could not be reused again. And we had a good postmortem about having a good .gitignore file and, on top of that, not committing your private keys. But this happens. Today some of these services actually have scanning tools. If you try it today, I think within GitHub and other places you'll probably get an email really quickly because their system will scan it and say, hey, it looks like you have a secret or a key or something public, so you might want to check on that. But still, the responsibility sits on you as an SRE to handle this and manage this. The other big thing: scaling out deployments can be a wee bit challenging, right? Businesses are using homegrown methods to rotate keys, or maybe commercial or open source vaults to manage this. I talk to a lot of customers on a weekly basis who have grown. It was fine when they were small, but now they're hiring, as a lot of sectors are hiring a lot, and they're rapidly scaling out their virtual infrastructure. And on top of that they're having issues like, what do we do now, how do we manage this? Maybe we had a cron job or a bash script or an Ansible playbook, and then it gets really complicated. At another job I used to work at, we only had eight or nine developers and I had like 80 or 90 servers. So I wrote an Ansible playbook.
So I would get all their public keys. I'd message them on Slack or email, they'd send them to me, and then I'd run my playbook and update all my servers. Whether or not they needed access, they were devs, and it was just easier to put their public key on all the servers and go from there. But when that person moved on or that project moved on, with everything else going on, I didn't necessarily have a lot of time to go back and clean up those keys, and that definitely posed some challenges. Another big thing: keys don't expire. Unlike other things out there, keys typically are not going to expire. So that brings in the trouble that if a key is out there, it could still be out there a couple of years later, still valid, and still work. So Hackerman over there: give him some keys and he's going to have a good time with it. Just one of those cons to think about. So let's segue a little bit to certificates. Did you know that OpenSSH, which is pretty much the de facto standard for SSH, period, added support for SSH certificates back in version 5.4? This is actually from the release notes, which talk about SSH certificates, what they're made of, and how you would generate them. So here's the big kicker, I think. Look there at the bottom. It was released in 2010, eleven years ago. Let that sink in. We've had the ability to use SSH certificates instead of public/private keys, or in lieu of them, or in conjunction with them, for eleven years. So now you're thinking, I'm intrigued. Or maybe you're thinking, why am I just now hearing about certificate authentication, with all the "problems"? I have that in quotes because they're not necessarily problems. They're things that I think we deal with a lot in the SRE world, and it's just par for the course and we move on. So certificates, we do use them all the time.
You've been using them for years, thanks to Google and Let's Encrypt and other companies that have made HTTPS a web standard today. And guess what? Five, six, seven years ago, a majority of the sites you hit out there probably were not HTTPS enabled. It was only those e-commerce sites, on that final cart checkout when you put in your credit card information. But now, if you hit most websites across the web, you're going to get HTTPS. In fact, most of the time, if you hit a site that does not have it, you'll get a warning from Chrome or one of the other modern browsers. So the industry has migrated to using HTTPS as a de facto standard, which is all certificate-based authentication and verification on the web. From an SSH perspective, we're not using HTTPS certificates, not using SSL certificates; we're using SSH certificates. So there is a little bit of a learning curve, as you'll see as we dive into this. But I believe once you get over that, you'll realize that it does pay off in the long run. Companies are already using certificate authentication today. Netflix open sourced BLESS, which uses certificates. Lyft has their fork of it, and there are a number of them out there. And these companies have the developer staff, right? We're thankful for Netflix and Lyft, who open source some of their big projects that become standardized across the board, but they have large engineering teams, large development teams, and they're software companies, whereas your organization might not have the expertise in house to build, run, or develop something like that. There's also a general lack of understanding and knowledge around certificate authentication. And traditionally there's also been a lack of good tooling around provisioning, storing, auditing, and rotating certificates. So when you wrap all those together, you think, that's too complicated.
I'm just going to stick with what I know, which is keys, and move on from there. So let's look at a few of what I would call pros of using something like SSH certificates. We have a usability improvement, and part of that is this message that we get when we're logging into a system that we're not familiar with. We log in and we get this warning and we're like, do I want to connect? Yes or no? And guess what? We just continue on, ignore it, and go from there. There's also an operability improvement. When you leverage something like SSH certificates, you get host key verification and key distribution. Then there's a security improvement as well. With certificates, for one, you don't have to worry about permanent keys out there. On top of that, you get the ability, as you'll see, to have things like metadata, as well as expiration dates, to help you with those certificates. So let's look at SSH certificates in an image. What we have here is valid principals, keys, and a signature piece of the puzzle. What that does is protect the certificate when it is being leveraged. If you have a signature there and someone tampers with the key, guess what? That signature becomes invalid. We have a valid after and a valid before date, so you can set dates on certificates so that they only operate within a certain amount of time. And then there's what kind of certificate it is, whether it's a user certificate.
So me, as a user logging into a system, or maybe a host certificate, which would be your web server, your application server, whatever server you're trying to gain access to. And then there are some extensions, the valid principals, and a few others. I have a link there to a blog where we talk about this in a little more detail so you can see some of that. So let's dig a little bit deeper on this. In this part we're going to start peeling the layers back and walk you through how we're building this out. Then we're going to jump into a quick demo and show how you can do this today. So, certificates require a certificate authority to own the public/private key pairs used to generate those certificates. You need to have a CA, and you can roll your own, as we'll do in today's demo. From a cryptography standpoint, we're really not changing anything. We're not adding anything different. We're just validating and signing those keys across the board. If a certificate is tampered with, it breaks that signature and invalidates that certificate. So that signature gets broken, that cert is invalid, and guess what? Your connectivity to that system is now denied. And as I mentioned once before, and you'll probably hear me mention a few more times, certificates can be set to expire. This is probably one of my favorite features of SSH certificates: when one is issued, you know that there's only a certain time to live for it, and once it's done, you have to issue a new one in order to continue accessing your systems. And of course from a security, maybe even an SRE standpoint, using a shorter time to live on a cert hopefully equals your security team sleeping a little bit better at night, not worrying about these keys that are out there. Then there are host certs, which are used to identify hosts.
That is, that they are who they say they are. And then we have user certificates, which are used to identify the user, that the user is who they say they are as well. So let's continue to break this down and start showing you some code and how it works. What you'd need to do first is generate the host and user certificate authorities. So, the key type: as I mentioned, I'm using ed25519. Then the file name: I'm going to write this as host CA, user CA, whatever it is. Then a comment, so you have a little more information: hey, this is a host CA, this is a user CA. Now we're going to generate a host key and then sign it. Then we're also going to generate a user key and sign it. So I generate my host key, again using ed25519 for the type, my file name, what it's going to be called, and then a passphrase, which is optional. Then we're going to create and sign the host certificate based off that key, still using ssh-keygen. The -s is the file name of the CA private key. I generated that CA private key, so I'm going to use it to sign. The -I is my cert's identity. This is more for logs and things like that, so you know what the cert identifies as. The -h is for a host certificate. The -n is our comma-separated list of principals, which on the host side might be your fully qualified domain name. You can see in the example I've got the app's example name, localhost, and the app node. The -V is a time to live, and we'll talk a little more about what it is. For this demo we've got plus 2 hours for the host certificate, so 2 hours after its creation this certificate will expire. Then we're tying in the public key, and it's going to export out that certificate. So let's go ahead and flip it and do the same thing with a user certificate. We've got one created for our host. We're going to do the same thing for a user.
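The steps just described (generate a CA key pair, generate a host key, sign the host key into a host certificate) can be sketched as the ssh-keygen invocations below. To keep it runnable anywhere, this Python sketch only builds and prints the commands; passing each list to `subprocess.run` would execute them on a machine with OpenSSH installed. The file names, identity string, and principals are assumptions for illustration, not the names used in the demo repository.

```python
import shlex

def ca_keygen_cmd(path, comment):
    """CA key pair: -t key type, -f output file, -C comment."""
    return ["ssh-keygen", "-t", "ed25519", "-f", path, "-C", comment]

def host_keygen_cmd(path):
    """Host key pair; -N "" means no passphrase."""
    return ["ssh-keygen", "-t", "ed25519", "-f", path, "-N", ""]

def sign_host_cert_cmd(ca_key, identity, principals, validity, host_pub):
    """Sign a host public key: -s CA private key, -I identity (shows up in logs),
    -h marks a host certificate, -n comma-separated principals, -V validity."""
    return [
        "ssh-keygen", "-s", ca_key, "-I", identity, "-h",
        "-n", ",".join(principals), "-V", validity, host_pub,
    ]

if __name__ == "__main__":
    # Hypothetical names throughout.
    commands = [
        ca_keygen_cmd("host_ca", "host CA"),
        ca_keygen_cmd("user_ca", "user CA"),
        host_keygen_cmd("ssh_host_ed25519_key"),
        sign_host_cert_cmd(
            "host_ca", "app.example.com",
            ["app.example.com", "localhost"], "+2h",
            "ssh_host_ed25519_key.pub",
        ),
    ]
    for cmd in commands:
        print(shlex.join(cmd))
```

A user certificate is signed the same way with the user CA key, minus `-h`, and with login names rather than host names as principals.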
It looks pretty much the same, except it doesn't have that -h because we're not doing a host certificate. And my time to live is a little bit shorter because I want user certificates to be a little shorter lived. But other than that, I'm using my user CA to sign it. My identity is a username, email, whatever. My -n, which is my principals, would be something like a Linux login name. So if you're SSHing into a system as ubuntu, or ec2-user if you're using AWS, or whatever your name is, that would be your list of login names that you're allowed to SSH in as. The -V is your time to live. And then of course, going back here, there's what you're signing. So let's talk a little bit about that -V part. You can actually set a certificate to be valid from two weeks ago up until two weeks from now. In the OpenSSH documentation, or if you go someplace like explainshell and look at that -V and read the man notes on it, there's a whole host of options. You can do a plus 30, or a minus 30 to plus 30, so it's valid from 30 minutes before until 30 minutes after. You can put specific dates. It can get really complicated really fast. But look at your environment and architect, design, and plan it how you need it to work with your systems. Let's continue breaking this down. So now let's go view those certificates. You think, what happened now? Well, you look on your system, you're going to have a bunch of files, and you're going to see that the pub file name gets "cert" appended to it. But if you look at that cert file with the ssh-keygen -L option, you're going to see the certificate displayed. You're going to see everything we talked about.
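To make the validity window concrete, here is a small model of how a relative interval such as `+2h` or `-30m:+30m` maps to "valid after" and "valid before" timestamps, and of why a login stops working once the window has passed. This is a simplified illustration only: it handles relative offsets with the units s, m, h, d, w, not the absolute dates or special values the real option also accepts, and the half-open comparison in `is_valid` is an assumption about the boundary behavior.

```python
import re
from datetime import datetime, timedelta

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days", "w": "weeks"}

def parse_offset(spec):
    """Turn a relative offset such as '+2h', '-30m' or '+1w2d' into a timedelta."""
    sign = -1 if spec.startswith("-") else 1
    body = spec.lstrip("+-")
    parts = re.findall(r"(\d+)([smhdw])", body)
    # Reject anything that is not purely number+unit pairs.
    if not parts or "".join(n + u for n, u in parts) != body:
        raise ValueError(f"unsupported offset: {spec!r}")
    total = timedelta()
    for number, unit in parts:
        total += timedelta(**{_UNITS[unit]: int(number)})
    return sign * total

def validity_window(spec, now):
    """Return (valid_after, valid_before) for an 'END' or 'START:END' spec."""
    if ":" in spec:
        start, end = spec.split(":", 1)
        return now + parse_offset(start), now + parse_offset(end)
    # A single time means: valid from now until that time.
    return now, now + parse_offset(spec)

def is_valid(now, window):
    """True while 'now' falls inside the window."""
    valid_after, valid_before = window
    return valid_after <= now < valid_before

if __name__ == "__main__":
    issued = datetime(2021, 1, 1, 12, 0, 0)  # arbitrary example timestamp
    window = validity_window("-30m:+30m", issued)
    print("window:", window[0], "to", window[1])
    print("valid at issue time:", is_valid(issued, window))
    print("valid 31 minutes later:", is_valid(issued + timedelta(minutes=31), window))
```

Starting the window slightly in the past, as in the minus 30 to plus 30 example from the talk, is one way to tolerate small clock differences between the machine that signs and the machine that verifies.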
You've got a host certificate there, you've got the key ID, you get the principals, and then you have the validity period, which is really kind of cool. You can just view that and see how long it's valid. All right, so let's jump quickly into the demo piece of the puzzle and show you what it's going to look like and how you can get started today as well. Let me switch over to my terminal. All righty, so I've got a GitHub project that is up and running, and the links to this repository are at the end, so you can pull it down from GitHub. If I look, I've got a couple of files. I've got a Dockerfile. What we're going to do is run this out of Docker, and you're going to see us standing up two Docker images. One is called an app node, which we're going to SSH to. The other is a bastion. Over here we can see I have no images at all, so I'm going to go ahead and build this using Docker Compose. Let's build this out; give me a minute or two here. When you pull this repo, I've got a README out there. There are two branches on this repository: the main branch, and a 30-minute branch, which is the demo I'm using for this. Feel free to switch to that, and you can dig in a little bit more. As for those Dockerfiles, there is some SSH configuration that you would also need to run within your environment; they set the sshd configuration for me on the containers. All right, so we've got a build. Let's go look at it. We've got an app node and we have a bastion node. Our goal is to SSH to the application node from the bastion node, or from my local MacBook here, without using a standard username and password or a public/private key. So now that's built, I'm going to start these containers. I'm going to run docker compose up and detach it.
So now we're going to see here that my systems are running. We can see that one is listening on port two, two. One is listening on port two, two, three. So we're going to do a docker logs. I'm going to follow both of these systems so we can see in real time what's happening. What we have here is the application node, which is the end node that we eventually want to SSH into, and we're going to do it through a jump host, a bastion host. Now that I have my systems up, I've got a script here called copy keys. What this script is going to do is copy down my certificate authority information and put it in my known_hosts, copy my certs, and then build out the SSH config that I'm going to need to access these systems. So I'm going to go ahead and run this. And again, this is a demo repo. This is not something you want to run in production. Let me caveat that: I use this just for learning and experimenting. Security is probably not the best on this, but it's really designed to help you as a user understand how SSH certificates work in a little more detail. So now everything's been copied locally. If I look in my known_hosts file, let me cat it real quick, .ssh/known_hosts, and grep, you're going to see that I've got the host CA. What I've done is add those host certificate authorities into my known_hosts, and this is part of using SSH certificates. If you read the documentation, it talks about having this, so that systems know, hey, this is a known host, this is valid, we are good to go. So let's go ahead and cd over to my temp SSH files folder that was created here. You can see I've got my public/private keys and my certificates. Some of this I don't necessarily need, but for the demo let's not worry about it.
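For readers who want to see what "adding the host CA to known_hosts" amounts to, the sketch below formats the kind of `@cert-authority` entry a client uses to trust host certificates signed by a CA, plus the server-side sshd settings that point at a user CA and a host certificate. The directive names are standard OpenSSH ones, but the host pattern, paths, and key material are placeholders, and this is not a copy of what the demo's script or Dockerfiles actually write.

```python
def cert_authority_line(host_pattern, ca_public_key):
    """Format a known_hosts entry trusting host certs signed by this CA.

    ca_public_key is the single-line contents of the CA's .pub file:
    '<key type> <base64 data> [comment]'.
    """
    key_type, key_data = ca_public_key.split()[:2]
    return f"@cert-authority {host_pattern} {key_type} {key_data}"

def sshd_cert_settings(user_ca_pub_path, host_key_path):
    """Render sshd_config lines that enable certificate-based access."""
    return "\n".join([
        # Accept user certificates signed by this CA.
        f"TrustedUserCAKeys {user_ca_pub_path}",
        # Present the host key together with its signed certificate.
        f"HostKey {host_key_path}",
        f"HostCertificate {host_key_path}-cert.pub",
    ])

if __name__ == "__main__":
    # Placeholder key data and hypothetical paths.
    host_ca_pub = "ssh-ed25519 AAAAEXAMPLEKEYDATA host-ca"
    print(cert_authority_line("*.example.com", host_ca_pub))
    print()
    print(sshd_cert_settings("/etc/ssh/user_ca.pub", "/etc/ssh/ssh_host_ed25519_key"))
```

With an entry like this in place, a client can verify any host certificate signed by the CA instead of prompting to trust each new host key individually.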
In fact, if I want to view what that certificate looks like, you can see it right here. This certificate is valid for about five minutes. In this demo I changed it, so in about five minutes my certificate is going to expire. This lets me know, as a user, which principals I can log in as and which SSH extensions I'm allowed to use. If I look at my configuration file, you can see I've got a host and a bastion here, and what I'm going to do is proxy jump from one into the other. So let's do that and show you how it works. I'm going to do an ssh -F, point it at my config file, and log into my application node. Immediately, you saw some information come across these other screens, and you see it says accepted certificate, right? So it accepted and validated that my certificate was legitimate and wasn't expired, and I am now within this server. Now I can run things and do the work. So I spun all these up just to run a dad joke, right? What did the beaver say to the tree? It's been nice gnawing you. So that's an example showing how we SSH in. You'll notice I did not get the warning: hey, do you recognize this host? Should we SSH in or not? I'm able to get in right off the top. So let's exit out. Now you're thinking, did it really leverage that certificate? I can run this command: I'm going to SSH in and add some verbose debugging information, and you'll see from my local SSH that it's using certificates. It logs in here and it sees, hey, this certificate is valid for a certain amount of time, and you can see how my bastion host matches its host certificate and how my app node also matches its host certificate. So it's handshaking with them on the back end and saying, hey, we're connected, we're authenticated, we're able to access those systems, and we're good to go.
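The configuration file with "a host and a bastion" and a proxy jump between them might look roughly like what the sketch below renders. `ProxyJump`, `IdentityFile`, and `CertificateFile` are standard ssh_config keywords, but every alias, host name, port, user, and path here is a made-up placeholder rather than a value from the demo repository.

```python
def host_block(alias, hostname, port, user, key_path, proxy_jump=None):
    """Render one Host block of an ssh client config that logs in with a certificate."""
    lines = [
        f"Host {alias}",
        f"    HostName {hostname}",
        f"    Port {port}",
        f"    User {user}",
        f"    IdentityFile {key_path}",
        # The signed user certificate that goes with the private key above.
        f"    CertificateFile {key_path}-cert.pub",
    ]
    if proxy_jump:
        # Reach this host by first connecting through another Host alias.
        lines.append(f"    ProxyJump {proxy_jump}")
    return "\n".join(lines)

def render_config():
    """Two blocks: a bastion, and an app node reached through it."""
    bastion = host_block("bastion", "localhost", 2222, "demo", "./user_key")
    app = host_block("app", "app-node", 22, "demo", "./user_key", proxy_jump="bastion")
    return bastion + "\n\n" + app + "\n"

if __name__ == "__main__":
    print(render_config())
```

Saved as a file such as `ssh_config`, a config like this would be used with `ssh -F ssh_config app`; adding `-v` prints debugging output in which the certificate being offered is visible.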
So let's look at the bastion certificate as well and see what kind of time to live we had on that one. This one has a 15-minute time to live, so that certificate will still be valid for another 15 minutes before it expires, whereas my client certificate should expire fairly shortly. At the time I'm recording this video it's 4:03. So let's take a look and see whether it will deny me or let me into the system once the certificate expires. And boom, here's what happened. I just tried. A few minutes ago I was able to log in, but now that certificate has expired, so guess what happens? I'm not able to log into that system anymore, which is super awesome because I don't have to worry about a public key that's sitting out there. That certificate expired. When that happens, the system says, sorry Allen, you're not allowed in because your certificate is no longer valid. You have to issue a new one in order to gain access to your systems. And to me, that's the power of leveraging SSH certificates over keys: being able to control and gate some of that access across the board. So as we finish up here and highlight some of that, how do you get started? You're probably thinking, that was awesome, hopefully. If not, that's okay too. There's a link to the GitHub repo where you can pull this down, and you can add that -vv for verbosity. It uses Docker; you can use Podman or something else and just modify it, but it uses Docker and Docker Compose to stand up these containers. This is just a simple high-level overview of how this works. If you want something that is way more complicated and way super cool, go check out Teleport. It's a fully open source access plane project, with almost 10,000 GitHub stars, used in production by companies all over the world. It is the company I work for, so a little bit more on that.
We take this concept and we extend it and expand it so that users can leverage short-term certificates to access resources wherever they might live. And of course there are some great resources there. In fact, a lot of the content in this presentation was written by some really smart folks, and I just repurposed it. There are two links: how to SSH properly using SSH certificates, and a deeper dive into what they are. As for where you can find me: shoot me an email, I'd love to connect. Find me on LinkedIn as well, or even on Twitter, where I tweet about food and/or technology every so often. And of course, as I mentioned, there's our GitHub for Teleport. We also have an open source Slack community. We'd love to have you join and ask questions as you're playing around, learning how this works, and wanting to chat about things like that. On top of that, we are hiring. So take this moment: if you're interested in working for a Series B funded, fast-growing startup, we'd love to have you apply with the link there. We have 100 employees and we're fully distributed, working on open source, so if you've got a passion for open source and security, give us a shout. Anyways, that's all I've got today. I just want to thank you for attending today's session here at Conf42, and have a wonderful day.

Allen Vailliencourt

Solutions Engineer @ Teleport



