Conf42 DevSecOps 2022 - Online

Who's managing the credentials for your data infrastructure?

Abstract

If data infrastructure is evolving toward a dynamic nature, why are you still using static database credentials? This talk raises some difficult questions for the audience about long-lived static access and provides a tested approach to authenticate, authorize, and audit both user and application access.

Summary

  • A national police database of a certain country was leaked. The database contained terabytes of information on a billion people. As software professionals, we might be more interested to learn how this leak happened and how to prevent this.
  • Dewan Ahmed is a developer advocate at Aiven, a company that builds and manages your data infrastructure. 80% of data breaches are the result of poor or reused passwords. We'll talk about how dynamic credentials can be a solution. The best way to understand something is with a demo.
  • You might have a CI server running somewhere with your database credentials. An unhappy employee who left the company six months ago might still have access to your data infrastructure. This issue is called secret sprawl. Companies are creating their own custom solutions, including custom cryptography.
  • The AAA model is authentication, authorization and auditing. Open source projects like Klaw can help add an audit layer on top of Apache Kafka. With Klaw, it's easy to check later who requested what and when the change went live.
  • HashiCorp Vault is an open source secret management tool. Dynamic credentials are generated on demand to provide time-bound access. Vault provides encryption as a service so that your data is encrypted both at rest and in transit. The key to choosing the right tool is flexibility.
  • In this demo we are creating database secrets. We will only be able to read the weekly metrics reporting table and not the employee salary table. We can connect using the admin credentials or using dynamic credentials. The dev-mode Vault server is for demo purposes only; do not use it in production.
  • The demo shows how you can use auditing to see who accessed your resources. Do you have a break glass procedure for data infrastructure security? You might want to try the demo yourself.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
This post is from Changpeng Zhao, the CEO of the world's largest cryptocurrency trading platform. What he's referring to is one of the largest data breaches in terms of the number of people affected. In the summer of this year, a national police database of a certain country was leaked. This database contained terabytes of information on a billion people. The hacker offered to sell the lot for ten bitcoins, roughly $200,000 at the time. CZ's, or Changpeng Zhao's, post came in four days after the hacker's post, when a reporter from the Wall Street Journal verified that the leaked information was valid by calling people from that leaked database. As software professionals, we might be more interested to learn how this leak happened and how to prevent it. Apparently a developer wrote a blog post containing valid Elasticsearch credentials. The credentials were valid for a year and the service could be accessed publicly. If that breach doesn't scare you, and you think that you and your organization are immune to data breaches, I recommend checking out this website that visualizes the largest data breaches across the globe.

Welcome to Conf42 DevSecOps 2022 and the security track. I'm Dewan Ahmed with the session "Who's managing the credentials for your data infrastructure?". I'm a developer advocate at Aiven, a company that builds and manages your data infrastructure based on open source technologies. I'm from beautiful New Brunswick, on the east coast of Canada. For the last ten years I have been focusing on application and data infrastructure. In my free time, I'm a pro bono career coach, which means I help students and new grads start or transition into a career in tech.

Now, when we talk about data infrastructure security, that's a very wide field. There could be issues of physical access: someone might run away with the disks, and then there is nothing you can do. Your data services are running on some physical or virtual machines; if the host machine is corrupted or compromised, then the services running on it will be impacted as well. There could be SQL injection or insertion attacks from a web application, and any of these three might mean data loss or backup-related issues. Today, I'm not talking about any of these. I'm talking about database access, and that is because 80% of data breaches are the result of poor or reused passwords. Speaking of poor or reused passwords, here's the agenda: we have the problem, database access; we'll talk about how dynamic credentials can be a solution; since there is no limit on the number of choices we have, I'll talk about the strategy for choosing the right tool; and finally, since the best way to understand something is with a demo, we'll finish with one.

So let's talk about secret sprawl. What is secret sprawl? You might have a CI server running somewhere with your database credentials. An unhappy employee who left the company six months ago might still have access to your data infrastructure. You just don't know where you have your passwords. This issue is called secret sprawl. Since we all love to roll our own, you might be inclined to build encryption as a service yourself, although that might not be your business expertise. So that's another issue: companies are creating their own custom solutions, including custom cryptography, whereas there are tons of useful solutions and products out there. And here's my favorite: using the same database password since forever.
And this is the password under the table, where everyone sort of knows the password, which is passed around the office. And if you have passed any credentials to an application, that's a bad idea. Applications are not very good at keeping secrets. The moment they have one, they're going to leak it to some audit logs or some other output, and it creates a disaster all throughout.

So with that in mind, let's talk about the protection, the AAA model, which is authentication, authorization and auditing. It's easier to apply the AAA model to a product or service that we might already know, so let's take Apache Kafka as an example. How does the first A, authentication, work with it? The idea is we have to be able to properly authenticate Kafka clients to the brokers, and for that there are two mechanisms. One is using SSL, or Secure Sockets Layer, and the second one is SASL, the Simple Authentication and Security Layer. With SSL, the idea is to issue certificates to your clients, signed by a certificate authority, or CA. This is the most common default setup if you're talking to a managed Kafka cluster. The second one is SASL. There's the term simple within the name, but don't be deceived: it's not that simple. The idea is that the authentication mechanism is separated from the Kafka protocol. It's popular in big data systems, and most likely if you have a Hadoop setup, you are already leveraging this. Once your Kafka client is authenticated, the brokers need to decide what it can or cannot do. This is where authorization comes in, in the form of ACLs, or access control lists. The idea is pretty simple: user A can (or cannot) do operation B on resource C from host D (a concrete example command is sketched below). The last A is auditing, and you can think of Kafka audit logs. The value of audit logs is that they provide data you can use to assess security risks in your Kafka clusters. You can also use a number of sink connectors to move your audit log data to, say, an S3 bucket, so that you can analyze the data. However, there are some challenges with the ways in which enterprises handle tasks related to Apache Kafka configuration. That's why open source projects like Klaw can help add an audit layer on top of Apache Kafka. Klaw, as a data governance toolkit, can manage topics, ACLs and schemas. With Klaw, it's easy to check later who requested what and when the change went live, thanks to the audit logging feature. If you'd like to know more, check out klaw-project.io.

So now that we've talked about the problem, let's talk about the solution. What are dynamic credentials, and how can they be a solution? As the name suggests, dynamic credentials are generated on demand; they did not exist before, and this provides time-bound access. Let's think of a scenario. You have an engineering team where the engineers need access to your database for eight hours, the period they work every day. In the morning they start their work and generate dynamic credentials that are good for the day. Now imagine you have some applications that also talk to your database, an average call takes a few seconds, and the applications generate credentials right before they make the call. Does it make sense to give those applications eight-hour access as well? Probably not. Probably you want those applications to have a TTL, or time to live, of a few minutes for their credentials, whereas your human users might have eight or nine hours of access. Dynamic credentials also mean that the calls your human and machine users are making can be audited as well.
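Coming back to the Kafka ACL model for a moment, here is a hedged sketch of what such a rule can look like, using the kafka-acls tool that ships with Apache Kafka. The broker address, principal, host and topic names are illustrative placeholders, not values from the talk.

```bash
# Hypothetical ACL: allow principal "analytics-app" to Read the topic
# "weekly-metrics" only from host 10.0.0.12. All names and addresses are
# placeholders for illustration.
kafka-acls.sh --bootstrap-server broker.example.com:9092 \
  --command-config adminclient.properties \
  --add \
  --allow-principal User:analytics-app \
  --allow-host 10.0.0.12 \
  --operation Read \
  --topic weekly-metrics
```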
One thing we're certain of is that there is no shortage of tools. That's why it's important to talk about the factors when choosing the right tool, the first one being flexibility. Your developers love flexibility; they might be talking to the tool using a UI, a CLI or an API call. Your engineering team might have a number of services across cloud providers and other managed services, and you don't want a different secret management tool for each, so ideally you'd want a lot of integrations for the tool you choose. If encryption or cryptography is not your core business strength, you might want the secret management tool to handle the encryption as well. You'd also want the automatic expiry of tokens or secrets to be handled so that you don't have to create business logic for that, and in the event of a breach, you would expect the passwords to be revocable as well.

With that, I propose HashiCorp Vault, which is a secret management tool that started at HashiCorp as an open source project. With Vault, you can interact with the tool using a CLI, a UI or an HTTP API call. There are a number of providers, including different cloud providers and databases, for which you can generate dynamic credentials. Besides dynamic credentials, you can store long-lived credentials within Vault as well. Vault provides encryption as a service so that your data is encrypted both at rest and in transit. You can manage or revoke leases on secrets. If you have experience handling X.509 certificates yourself, you know how painful that process is. Vault can take that burden away from you and act as a root certificate authority or an intermediate certificate authority, depending on how you set it up. Finally, a number of customers have used HashiCorp Vault in production, and still do, so it's safe to say that Vault has been battle-tested. That's one more factor to consider when you choose your secret management tool.

Now let's take a look at a setup where there is no secret management tool and you're using a static password to communicate with the database directly. How would that interaction look? You tell the database: here's my static database password, give me some data. And the database says: yep, that looks okay, here's your data. Now let's take a look at a setup where you have something like Vault in the middle. At first you'd have to authenticate to Vault. Vault would talk to the provider, in this example the database, and verify that your user credentials are valid based on the ACL, or access control list, that is set up. Vault would then generate a dynamic credential for you that would be time-bound, and you, a human or an application, would talk to the database using that dynamic credential.

Let's take a look at Vault's architecture so that we can better understand the demo. There's a clear separation of components here. Vault's encryption layer, referred to as the barrier, is responsible for encrypting and decrypting Vault data. The HTTP API and the storage backend are outside of the barrier, and that's why they're untrusted. Data that is written to the backend is encrypted. In order to make any request to Vault, Vault has to be unsealed. Once Vault is unsealed following the unseal process, a user can make requests to Vault, and each request has to have a valid token that is handled by the auth method.
The token store is responsible for actually generating the tokens, and you have the policy store to configure and check that the proper authorization is in place. You have different secret engines for cloud providers or databases to generate these dynamic credentials, and you can configure one or more audit devices for auditing. But we can only understand so much by looking at an architecture diagram, so let's dive into the demo.

Here we have the Aiven console, where we can create PostgreSQL or any other data-related services using the Aiven console, the Aiven CLI or the Terraform provider. I'm choosing the Google Cloud northamerica-northeast1 region and a specific startup plan, but for your business case you might choose a higher plan. I'm going into the service to see the service creation. You can see that under connection information I have the URI, the hostname, the port and other connection details. The blinking blue indicates that a VM is being provisioned; for the sake of time, I'll fast-forward the service creation. A solid green indicates that my service is up and running, and now I can use a PostgreSQL client to connect to the service. I'm using the PostgreSQL admin credentials here, and by the time you're watching this video, these credentials are already invalid.

All right, so now that we are connected to our PostgreSQL service, let's start by creating two tables. The first table is the weekly metrics reporting table. This has information like product downloads, GitHub stars and Twitter followers, so all public information. The second table is the employee salary table, which has sensitive data on our employees. So imagine an application that is supposed to be able to read the weekly metrics reporting table, but should not have read access to the employee salary table. Let's now add some sample data to both of these tables. After that, we'll configure HashiCorp Vault to generate dynamic credentials so that we can only read the weekly metrics reporting table and not the employee salary table.

So let's start Vault with vault server -dev. The dev flag indicates that the Vault server starts in development mode. This means that the server is unsealed, and it gives us the root token, which is like the master password. As almost always: this is for demo purposes, do not use this in production. Now that we have the Vault server up and running, let's export VAULT_ADDR, the address of the Vault server, in our CLI session, just so that the CLI knows where the Vault server is running. I'm running the Vault server on the same machine, so the address is localhost, and it's running on port 8200. Once I have that, I'll need to enter my PostgreSQL service password a couple of times, so rather than typing it again and again, I'm adding it as an environment variable as well. Once I do that, I enable the database secrets engine. You can have different secret engines enabled for different providers, but in this demo we are creating database secrets, so I'm enabling the database secrets engine. The second step is configuring Vault with the proper plugin and connection information. The plugin for the database is the postgresql-database-plugin. I'm calling the role metrics read and write, and for the username and password, rather than hard-coding the values within the connection URL, I'm using a template. This follows best practices.
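Here is a minimal sketch of the setup steps just described, assuming a local dev-mode Vault and an Aiven PostgreSQL service. The table schemas and sample rows, the connection name aiven-postgres, the role name metrics-read-write, and the admin user avnadmin are illustrative guesses; the host, port and passwords are placeholders for your own connection details.

```bash
# 1. Create the two demo tables over the admin connection (illustrative schemas).
psql "$POSTGRES_ADMIN_URI" <<'SQL'
CREATE TABLE weekly_metrics_reporting (
    week              date,
    product_downloads integer,
    github_stars      integer,
    twitter_followers integer
);
CREATE TABLE employee_salary (
    employee_id integer,
    full_name   text,
    salary      numeric
);
INSERT INTO weekly_metrics_reporting VALUES ('2022-11-07', 1200, 350, 4100);
INSERT INTO employee_salary VALUES (1, 'Alice Example', 95000);
SQL

# 2. Start Vault in dev mode (already unsealed, root token printed) -- demo only,
#    never production.
vault server -dev

# 3. In another terminal, point the CLI at the dev server and keep the PostgreSQL
#    admin password in an environment variable so it isn't retyped.
export VAULT_ADDR='http://127.0.0.1:8200'
export PGPASSWORD='<aiven-admin-password>'

# 4. Enable the database secrets engine.
vault secrets enable database

# 5. Configure the PostgreSQL plugin; the username and password are templated
#    rather than hard-coded into the connection URL.
vault write database/config/aiven-postgres \
    plugin_name=postgresql-database-plugin \
    allowed_roles="metrics-read-write" \
    connection_url="postgresql://{{username}}:{{password}}@<host>:<port>/defaultdb?sslmode=require" \
    username="avnadmin" \
    password="$PGPASSWORD"
```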
So once I do that, the third step is configuring a role that maps a name in Vault to a SQL statement to execute to create the database credential. Here I'm saying that for this specific metrics read and write role, the creation statement grants all on the weekly metrics reporting table, so the generated user can perform all operations on it, and the credential will be valid for one hour. That's the default TTL, or time to live. So those were the three steps: enable the database secrets engine, configure Vault with the plugin and connection information, and then configure a role. Once I do that, I can use the Vault CLI, the vault read command, to generate a dynamic credential, or we can make an HTTP API call to generate the credentials. You can use the Vault UI as well, but I'm not showing that in this demo. In order to make the API call we need a token, so I'm using the root token here. Here you can see that we generated two sets of PostgreSQL credentials and both are valid; they will be valid for the next hour or so.

Once I have that, I'll go back to my PostgreSQL service, connect, and then start my testing. My testing will have two steps. Step number one is to connect using the admin credentials and try to read these two tables, expecting that I should be able to read them because those are admin credentials. Step number two is to connect using one of these dynamically generated credentials and see if I can still see those two tables. First I'm connected using the admin credentials and I'll try to read the weekly metrics reporting table. I can see it; that's expected. Then I'll try to read the employee salary table. I can see that too, because this is an admin credential. Cool. Now let's disconnect and reconnect using a dynamically generated credential. We generated two credentials; we can pick either one, whether the one from the CLI or the one from the HTTP API. It doesn't matter, both are the same. All right, let's pick the one we generated using the CLI. This is the username, and defaultdb is the name of the database. Let's copy the password for this username. Since authentication works, we are able to connect to the database using this dynamically generated password. Now let's check the authorization part. We can read the weekly metrics reporting table, because that's how we set up the role. Now, the moment of truth. Okay, so we are denied permission on the employee salary table, because we didn't grant any permission other than on the weekly metrics reporting table. So it seems that our authentication and authorization worked.

Now it's time to test the auditing feature of HashiCorp Vault. By default, auditing is not enabled, so step one is enabling the audit option. You can enable the audit device at the default path or a custom path; in this case, I'm enabling a file audit device under the path vault_audit_1. When I read the audit file, I don't see any information, because after enabling auditing I didn't interact with Vault. So let's interact with Vault: let's generate a dynamic credential and then look at the audit file again. This time we can actually see some data. We can see that a secret was generated, it has a lease ID, and under the data we can see that the secret is not in plain text, which is expected. You don't want your credentials to be in plain text in an audit file. In this case, the data is hashed with a salt using HMAC-SHA256.
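Continuing the same sketch for the role, the dynamic credentials, the connection test and auditing; the names follow the assumptions above, and VAULT_TOKEN holds the dev-mode root token (again, demo only).

```bash
# 6. Map the Vault role to the SQL that creates the credential: grant everything
#    on the metrics table only, with a default TTL of one hour.
vault write database/roles/metrics-read-write \
    db_name=aiven-postgres \
    creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; GRANT ALL ON weekly_metrics_reporting TO \"{{name}}\";" \
    default_ttl="1h" \
    max_ttl="24h"

# 7. Generate a dynamic credential via the CLI ...
vault read database/creds/metrics-read-write

# ... or via the HTTP API.
curl --header "X-Vault-Token: $VAULT_TOKEN" \
     "$VAULT_ADDR/v1/database/creds/metrics-read-write"

# 8. Connect with a generated username/password and test authorization.
psql "postgres://<generated-username>:<generated-password>@<host>:<port>/defaultdb?sslmode=require" \
     -c 'SELECT * FROM weekly_metrics_reporting;'   # allowed by the role
#    -c 'SELECT * FROM employee_salary;'            # would fail: permission denied

# 9. Enable a file audit device under a custom path, make another request, then
#    inspect the log: secret values appear as salted HMAC-SHA256 hashes.
vault audit enable -path=vault_audit_1 file file_path=/tmp/vault_audit_1.log
vault read database/creds/metrics-read-write
cat /tmp/vault_audit_1.log
```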
If you are a system administrator, you might be thinking: how can I use this information to tell who accessed my resources? To do so, you can generate a hash using the same audit hash function. In this case, let's copy this username. Say you have a suspicion about this user, and you put their username into, say, a payload file. The idea is to use this payload file to make an API call to the same audit hash function; you generate the audit hash, and the hash will match the hash in the log file. So now let's make the call, which goes to the sys/audit-hash endpoint for the vault_audit_1 audit device. You can see that the generated hash matches: it ends with the same characters as the hashed username from your audit log. You can run one or more usernames, or any other data, through this audit hash function and generate the hashes yourself. This lets you take care of the auditing as well. So with HashiCorp Vault, we checked authentication, and that worked. Authorization also worked: we were not able to read the table we were not allowed to. And this shows how you can use auditing as well.

So you did everything you were told, you followed all the best practices, but someone still figured out how to crack your infrastructure. What do you do? Do you have a break glass procedure? That means: do you have a bypass mechanism so that, in case something catastrophic happens, you still have a procedure to access your services? I'm not going to go into too much detail on the break glass procedure for data infrastructure security, but you can read more on that and make sure that your organization has a break glass procedure besides having the regular procedures in place.

I know this is a ton of information for 30 minutes, and you might want to try the demo yourself. If you'd like to do that, there's a blog link on this slide and a QR code that will take you to the same blog. You can follow along with the blog and create a free account on Aiven. Alternatively, you can use another PostgreSQL service for the demo if you have one running. I'd love to hear your feedback. You can reach out using the LinkedIn or Twitter handles, or shoot me an email if you have any questions. I look forward to networking with you throughout the event, and thank you so much for joining my talk.
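If you want to reproduce the audit-hash check from the demo, here is a minimal sketch, assuming the vault_audit_1 audit device and dev-mode root token from above; the username in the payload is a placeholder for whatever value you suspect.

```bash
# Put the suspected plain-text value (for example, a generated database username)
# into a payload file.
cat > payload.json <<'EOF'
{"input": "<suspected-username>"}
EOF

# Ask Vault to hash it with the same salted HMAC-SHA256 function used by the
# audit device; the returned hash should match the hashed value in the audit log.
curl --header "X-Vault-Token: $VAULT_TOKEN" \
     --request POST \
     --data @payload.json \
     "$VAULT_ADDR/v1/sys/audit-hash/vault_audit_1"
```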
...

Dewan Ahmed

Senior Developer Advocate @ Aiven



