Conf42 DevSecOps 2021 - Online

Automatic serverless security testing: Delivering secure apps continuously

Abstract

Serverless technology eliminates the need for development teams to provision servers, and it also results in some security threats being passed to the cloud provider. This frees up developers to concentrate on building logic and producing value quickly. But cloud functions still execute code. If the software is written poorly, it can lead to a security disaster.

How can developers ensure that their code is secure enough? They can scan for common vulnerabilities and exposures (CVEs) in open-source code. They can even scan their Infrastructure-as-Code (IaC) templates to identify insecure configurations. But what about custom code? At many organizations, the application security team struggles to keep up with the speed of development in a serverless environment. Traditional testing tools not only provide very limited coverage, but also slow development cycles unacceptably. Serverless code contains a mixture of cloud configurations and application programming interface (API) calls. As a result, legacy solutions lack the context that is necessary in a serverless environment, and the consequence is a lack of observability and slower response times.

Fortunately, it does not have to be this way. Organizations can leverage robust security during serverless development, automatically—if it is done properly. In this talk, we will discuss common risks in serverless environments. We will then cover existing testing methodologies and why they do not work well for serverless. Finally, we will present a new, completely frictionless way of testing serverless applications automatically—with no scripts, no tests, and no delays.

Summary

  • Tal: Forrester predicts that one out of four of you will use serverless regularly by the end of the year. Serverless is a different architecture with many, many resources. Developers get more responsibility, part of which is security, and that drives bottom-up decision making.
  • Serverless is less about a synchronous flow and more about an event-driven architecture. Wherever your code is, your mistakes could be too. If you're making big mistakes, that could really end up in a cloud disaster.
  • The serverless computing trend continues to grow. Can we apply traditional application security to serverless? The biggest challenge is the permission or the policy given to the function. When you talk about serverless, you lose the perimeter.
  • Event injection is basically someone attacking your function with arbitrary code. Broken authentication is functions that do not perform any type of authentication. Open resources are Lambdas and other services that are unprotected, unconfigured, or misconfigured. Insufficient logging and monitoring is another risk.
  • Can security scale on serverless? Well, it can, but there are some challenges. Serverless functions go to production on a daily basis. Who takes care of the infrastructure? It's hard to know what's important.
  • How do we test security in a modern CI/CD pipeline? Traditional tools are not working well for serverless. All the tools are ignorant of the environment and the context. If we want to use those security testing tools in a cloud native environment, we're going to get more problems.
  • So how should we do security testing for serverless? Let's take an example. Trying to run an IAST on a Lambda function is really overkill. Infrastructure-as-code scanners are IaC dependent and give zero code coverage. In the cloud, we should do things differently.
  • There is also an open source project, DVSA, which you can deploy on your cloud with just three clicks. Just make sure you do not install it into a production account or any account with sensitive information. There are videos and tutorials, and you can learn how to secure and attack your serverless applications.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hi everyone, thanks for joining me for my talk about automated serverless security testing. My name is Tal. Let's start. So why am I giving you this talk? About five years ago I joined labs, a startup based in Israel that developed a serverless runtime protection solution. Before that, I had never heard about serverless. After two years on the road, we got acquired by Check Point, and after that I decided to shift further left and create my own company called CloudEssence, a cloud native security testing company that got acquired by Contrast Security last year. You can find me on social media, typically with my handle, appsec with a four. With the four, right. Why is this talk interesting? Forrester predicts that one out of four of you will use serverless regularly by the end of the year. So I think it's an important topic to talk about, and it's something we should be aware of. The security implications and challenges in serverless and serverless testing might be different. Let's see. We already know that the cloud transformation has begun. Everyone has started talking about cloud, cloud development, cloud native. Of course, there are companies like iRobot and Skyscanner which are pioneering this type of development, and they are ahead of the curve. But we can see even big organizations coming along with cloud native development. And even if they are now in a kind of hybrid between monolith and cloud native applications, they're going to move more and more, and we're going to see more and more adoption of cloud development. Okay, but what is serverless really? Serverless is a lot of things. First of all, it's a different architecture. No more big monoliths, one big flow application. Instead, many, many resources. Independent resources, which are configured to talk to one another and to other services in order to create the logic of the application. 
But each of these resources is independent, and we have to take care of that. Cycles are different. DevOps, DevSecOps, you might have heard of hyper-agile development cycles. No more waterfalls; quick iterations and quick time to value. Processes are different. Typically in the cloud, you better automate everything because you do not own the infrastructure. If you don't automate everything, it's really hard to get visibility and get information from the right place at the right time. So mostly we're automating things to get the information and the data that we need. The decision making also changes, so it's less top down. This is the time, this is the place to do something. Developers get more and more responsibility. Part of it is also security, and we'll touch on that. That drives bottom-up decision making, letting the developers take more and more responsibility. Serverless architecture. Well, this is a big picture, a small picture, sorry, of a medium sized, maybe even a small to medium sized application that is built on serverless. You can see maybe a couple of dozen APIs and certain functions and some other resources, which may sum up to, I don't know, 200, 300 resources. This is a small application. I've seen customers with millions of functions and resources. I cannot even start to imagine how it would look. The problem is that if you don't have this kind of view that we provide, or some kind of visibility into what you have, it's really hard to understand what is connected to what, who is talking to whom, and where my risks are. Because if I'm a security expert or even a development manager, it's hard for me to really understand what is going on in my system. I mean, Lambda functions and other functions pop up in production on a daily basis, if not more often. So it's really hard to follow that. That also means that we have to take care of each of these resources separately. 
Sometimes for security reasons. We'll have to monitor each one, to authenticate each one, to apply zero trust or any type of security that we want to each of those resources separately. Well, serverless, just to understand: it's less about a synchronous flow, it's more about an event-driven architecture. Something happens inside your cloud. It could be a file that was uploaded, downloaded, or deleted; an API call, which is kind of common; a table or entry inside your database that was changed; an IoT rule, a log, an analytic, almost anything that you can imagine. Something happened. It runs your code, and your code interacts with other services and resources in the cloud. The problem is that where your code is, your mistakes could be as well. And if you're making big mistakes, that could really end up in a cloud disaster, data breaches and whatnot. In general, AWS Lambda works like this. Something happened and triggered the function. AWS spins up a container for you, and we'll talk about it. It's not really a full container; it's more like a runtime environment ready for you. It runs your code. When the code finishes, the container dies. There is no more container. What about the security aspects of it? I'll mostly talk about AWS and Lambda here because it's the most common one, but you can think about some of these aspects in other cloud providers as well, like Azure, GCP, Alibaba and some others. So in AWS Lambda, the environment is a read-only environment. If you need to write somewhere, it's going to be /tmp. It's the only directory with write permission, which also has some security aspects. We'll talk about it in just a few seconds. The environment is not really wired to the interfaces. I mean, you can connect to services, APIs, whatever you want; you can connect to the outside world. Getting inside access is not possible. I mean, you cannot SSH to the runtime; you can just run your code inside, that's it. 
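Editor's sketch (not from the talk): since the temp directory is the only writable location, and files left there can outlive a single invocation when an environment is reused, one cautious pattern is to keep scratch files inside a directory that is removed before the handler returns. The handler name, the event shape and the file name below are hypothetical, chosen only for illustration.

```python
import os
import tempfile


def handler(event, context=None):
    """Hypothetical handler that keeps scratch data for one invocation only."""
    # tempfile.gettempdir() resolves to /tmp on Lambda; using it instead of a
    # hard-coded path keeps this sketch runnable on other systems too.
    with tempfile.TemporaryDirectory(dir=tempfile.gettempdir()) as workdir:
        path = os.path.join(workdir, "payload.txt")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(str(event.get("body", "")))
        size = os.path.getsize(path)
    # The directory and everything in it is deleted when the block exits,
    # even if an exception was raised inside it.
    return {"bytes_written": size, "scratch_exists": os.path.exists(workdir)}
```

Calling `handler({"body": "hello"})` writes five bytes and leaves nothing behind for a later invocation to read.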
The data is temporary, meaning that when the execution ends and your runtime dies, the data that was there is deleted. That is true, but for performance reasons, the cloud provider, in this case AWS, recycles environments. So in order to save the time of spinning up new environments, if a request comes in or an event happens, it executes your code. Then another one happens. That could be dozens or even hundreds at the same time. Randomly, the cloud provider will take an environment that already ran, because it's up, and just give it to the next incoming event. That means that the data that was there before, if it was not deleted by you, by the code, is still there. And if you have some security issues in your code, someone can even access this data and exfiltrate it. What other interesting security aspects can we think about? The code itself resides in the environment. So in order for your Lambda to run, the container comes up, your code is inside, and it runs your code. So the code is there. If I have access to the runtime, I have access to your code. The keys are also inside. That means the keys that are basically the permission keys for the Lambda inside the cloud. This is what lets your Lambda function communicate with other services and resources. This is a big challenge in security, specifically for Lambda and in the cloud in general. And they're inside the environment. That means that if I get access to the environment, I get access to the keys. I can do some, maybe even bad, things inside your cloud. Is serverless security testing a thing? Well, this is pretty old, but you can see a continuous rise in the serverless computing trend. This is a Google Trends map. It goes up and continues to go up. Maybe during the COVID period there was less, but it's picking up and growing with time. 
But you can see that for three, four years the serverless security search on Google was between zero and one, or two at the top one time, and I'm pretty sure that from around here, all the times that you see one is me. So no one really talks about it. And I think we should have more awareness about serverless security, because even though it still contains some of the previous security aspects, application security, there are some challenges that we need to discuss. Okay, so let's say it's a thing. Can we apply traditional application security to serverless? If so, we don't really need this talk, because we can do whatever we've been doing before. Right? Let's see. Let's inspect that. So what are the biggest challenges? The first one is the permission or the policy given to the function, which is a big challenge. If we're talking about one function, it's relatively easy, because you can look inside the code, do some code reviews, see the API calls, right, put item, and translate it into PutItem. Actually this is the good one. The X here is because it was animated before it was converted into PDF. So before the PutItem here, there was a wildcard, the star, the asterisk, which means the function can do any action inside the cloud, which means that if someone has access to this function or the code or the runtime, they can do whatever they want inside your databases, even if it's a database that is not even related to your application. Because also here there was a wildcard, a star, meaning access to any table. Whereas least privilege, as we call it, which is the right permission set, is to set PutItem as a specific action. That means that even if the function is vulnerable and someone can execute your code, or maybe supply arbitrary code for the function to run, it will be blocked by the cloud if it's not the specific action that the policy allows. 
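Editor's sketch (not from the talk): the difference between a wildcard policy and one limited to PutItem can be checked mechanically. The snippet below walks an IAM-style policy document and reports every Allow statement whose action or resource contains a star. The function name, the two sample policies and the table ARN are hypothetical; only the general policy layout (`Statement`, `Effect`, `Action`, `Resource`) follows the IAM JSON format.

```python
def find_wildcards(policy):
    """Return (statement_index, field, value) for each wildcard in Allow statements."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    findings = []
    for index, statement in enumerate(statements):
        if statement.get("Effect") != "Allow":
            continue
        for field in ("Action", "Resource"):
            values = statement.get(field, [])
            if isinstance(values, str):  # a single value may be a bare string
                values = [values]
            for value in values:
                if "*" in value:
                    findings.append((index, field, value))
    return findings


# Over-privileged: any DynamoDB action on any resource.
broad = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "dynamodb:*", "Resource": "*"}],
}

# Least privilege: one action on one table (hypothetical ARN).
scoped = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "dynamodb:PutItem",
        "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/orders",
    }],
}

print(find_wildcards(broad))   # two findings: the action and the resource
print(find_wildcards(scoped))  # no findings
```

A check like this only flags candidates for review; whether a wildcard is acceptable still depends on what the function actually needs.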
So if I'm going to change that into delete or scan, even if I change the code, if the policy states just PutItem, that means I will be blocked. And you should also do it on the resource level. So instead of putting a wildcard here, you should specify the exact table, which in this case is taken from the environment variable. So we should take the environment variable value and put it inside the policy resource. Well, when you do it for one function, it's really easy, but when you have to do it at scale, it becomes a problem. What happens, if you don't trust me, is this. The developer will go to Stack Overflow or any other website and look for 'my Lambda is unauthorized to perform dynamodb:Scan'. Okay, I'll put this error in some forums and I'll get: hey, I worked with an Amazon engineer and it turns out the problem was the policy configuration. It should be dynamodb:*. No, it should not be dynamodb:*. Of course this gives the function too many permissions. It just needs to do a scan. Let's see another example. This was taken from Stack Overflow, right? I don't even remember the question, but someone said: I solved this by adding the AWSLambdaFullAccess permission to the Lambda. Just go to the Lambda IAM role, blah blah, specify everything and add the permission. That should do it. No, that should not do it. You know why? Because the AWSLambdaFullAccess policy looks like this, and this is a truncated version of it. There are more permissions; I just put the important ones, or the risky ones: cloudwatch:*, dynamodb:*, events:*, lambda:* (you can execute any code), logs:*, s3:*, and Resource: *. Lucky for us, AWS deprecated this. So no, AWSLambdaFullAccess should not be the solution for your Lambda code. Okay, we talked about the policy; let's talk about other security challenges. In a monolith application, we usually have a synchronous API request coming in through the load balancer, through the API gateway, whatever that is. 
We can put all our security tools and security capabilities at this point. So whatever comes in goes through input validation, through output filtering, through DLP, through IPS, through firewalls. So you're almost always protected, right? At least 90%. When you talk about serverless, you lose the perimeter. That means that the attacker can really get into your code from different places that you haven't thought about before. It can be through an API? Yeah, of course. But it can also be from someone uploading a file, someone performing analytics, code commits, log processing, database changes, and it just executes your code. And there is nothing in the middle between the database and your code. So if someone changed the database, you cannot say, hey, before you run my code, transfer this data to me. No, this is not controlled by you, so you have to put the security inside the code. You remember this? Now protect each and every one of these. Well, if it's not automated, it's not going to happen. There are several other security risks. We're not going to talk about all of them. I'll refer you to information about them; I'll just mention some of them. Event injection is basically someone attacking your function with arbitrary code. Broken authentication is functions that are not performing any type of authentication, just relying on the incoming data. Sensitive data exposure: of course, Lambda contains sensitive data like keys and code and some secrets in your environment variables. So if you're not testing yourself, you might be at risk. Over-privileged functions, we talked about that. Vulnerable dependencies, well, that's not new; it's just now in a Lambda function. Insufficient logging and monitoring: well, AWS logs and monitors pretty much everything. You just need to connect to the right location and collect the right data and metrics. 
Open resources are Lambdas and other services like S3 buckets and APIs that are unprotected, unconfigured, or misconfigured, allowing anyone to access them. Denial of service and denial of wallet are the ability for someone to either block your Lambda functions because of the limitations they have, or make you pay for any Lambda executions. Insecure shared space: we discussed this earlier, the data in /tmp that is shared between random executions. And of course insecure secret management, because your Lambda can have keys and secrets inside the code or the configuration which are not protected. Well, can security scale on serverless? It can, but there are some challenges. There are a lot of services; Lambda is just one of them, but it connects to many others. There is frequent development; it's not a monolith application with downtime. Serverless functions go to production on a daily basis. What is connected to what? We discussed this: it's hard to know what a specific function is connected to if you're not the developer that wrote it. And of course, as was the case even before, there are many developers and fewer AppSec and security people. So it's hard to follow what's important. Well, my Lambda could have permission to do something meaningful in my cloud, but it might not be connected to anything that allows an attacker to access it. Or vice versa: I have a Lambda which has a code injection. Okay, very risky, but if the Lambda permissions allow it just to write into the logs, that means that even if someone accesses the code and runs arbitrary code, the function can only write logs. Not that it's not important, but it's less risky than someone reading data or modifying my files in an S3 bucket. So even more so, it's hard to know what's important. Is the security the same? Well, we talked about some aspects; it's not exactly the same. We'll see some other aspects in a few minutes. And there is another question. 
Who takes care of the infrastructure? Lambdas are in the cloud, permissions are configuration. It could be an AppSec team, a security team, it could be the developer, the DevOps team, or cloud engineering. I've seen basically everything, so it's just hard to understand who takes responsibility in these cases. All right, so we talked about the security aspects of serverless, but how do you test for security? We want to shift left, right? We don't want to just say, hey, we have a runtime protection tool in production, we're good. No, we want to know that we're shipping secure code. So how do we test security in a modern CI/CD pipeline? Well, let's take the traditional one. I want a SAST here that runs on every commit. I want an IAST, maybe something more accurate, that runs on integration tests, so some security tests in the integration or the E2E tests. And I want a DAST test: when the product is ready and shipped, I have a website, staging, whatever it is, and I want to test it. Well, these are the traditional tools, and I'd say those are not working well for serverless, for several reasons. The normal ones, which happened even before, are that SAST, or static analysis, gives a lot of false positives because it doesn't have context. It just reads text. So it's hard to understand what's important and what's not. That also means that the developer needs to work a lot to filter them, and the security team has to configure what is important and what is not. Because if I'm going to test for all the vulnerabilities or security policies, I'm going to get thousands of results. Meaningless, really. All right, so let's do an IAST, interactive application security testing. Well, that is good. The coverage is a problem, because you have to write tests in order to get coverage. And the security teams need a lot of work to instrument your code in order for it to work. 
But then something doesn't work and you don't know if it's the IAST plugin, the instrumentation, your code, or latency. So it's not really working so well, especially when we talk about a cloud native environment. DAST. Okay, those are good. They're not really CI/CD tools. It's really hard to operate them inside the pipeline. It requires a lot of work from both the engineering and the security teams. Usually they don't find anything meaningful. Sometimes they do, but their coverage is relatively low. There is a lot of communication work that needs to be done between development, engineering and security, because they need to know when they can test, what they can test, and on what environments. Basically you need to keep the environment alive with new data all the time, then freeze it, let the security team test and give you the results. The developer will go over the results and try to fix or understand what's going on, continuing to talk with the security team. Fix. Go back to the security team and say, hey, I fixed this, can you retest? Yes, next week we have another cycle, let's retest everything together. And time has passed, the testing was not done, and you need to ship your Lambda today. So it doesn't really work. What we need is something else. The problem is that if we want to use those security testing tools in a cloud native environment, we're going to get more problems. First of all, there is not just code. Pretty much all the tools are ignorant of the environment and the context. What are the environment and the context? Lambda is not just code. It is connected to an infrastructure, to the cloud. And the cloud means a lot of things that we need to know, like configuration and resources and services. And you need to understand that a Lambda is not an app. It starts somewhere, runs the code, and finishes. Then there is another service that picks up; maybe the Lambda writes into the database. 
But then when it writes into the database, you have a configuration that runs another Lambda, which pulls the data from the database, performs some action, and submits a report. So it's not something that you can really test like this. Also, tools are completely blind to non-HTTP events. All the security tools are built to support synchronous applications with some kind of an entry point, HTTP or some kind of traffic coming in. So if you want to test or fuzz your code, let's say take a DAST, right, a dynamic tester, you need to give it an endpoint to start working. Some Lambdas don't have entry points. Not entry points, sorry. Some Lambdas don't have endpoints; they don't have URLs, they don't have APIs. I've seen a system with, let's say something small, maybe 200 functions. Yeah, ten of them had APIs. I'd say 80, 90% of them don't have APIs. So you cannot test them in a traditional way. All of these issues that we talked about really block development and disrupt the CI/CD. That means it's very hard to scale at the pace of cloud native development, and they're not good enough. And when we're in the cloud, we should get better. So how should we do security testing for serverless? Let's take an example. This is a tiny application, really just three Lambda functions, taken from Amazon.com. It's based on the Roomba from iRobot, right? It's just the registration service. You bought an iRobot, you open it for the first time, and it sends one request: register your robot. And there is a process: there is a Lambda that processes something with IoT, writes to the logs, and puts data into the queue; then another function picks up from this queue, continues to run, and sends it to another Lambda; and this Lambda communicates with other services, IoT services. Really very simple. Let's see how we should test this. Easy, right? We can scan the image, right? 
If you do have an image and you want to just run an SCA on it, right, it's going to give you 10% coverage, 50% coverage, maybe even less, I don't know. And you just find potential problems that you imported, because it doesn't really mean you're vulnerable. You just imported some issues. I'm not saying it's not important, it is important, but it doesn't give you any coverage for your code, your configuration, your cloud. Zero. Usually those things are even provided out of the box by the cloud provider, so you can use them. And I think you should, but it's not enough. So what should we do? I know: infrastructure as code. We all use infrastructure as code now, right? Terraform, Pulumi, Serverless Framework, whatever that is. That's great. Shift left as far as you can go. But again, you get limited visibility, right? Because you just see configurations; zero code coverage. No one will tell you that you have a problem with your code. It will just tell you this line is vulnerable because you did not add encryption, which is good, I'm very happy for it, but it's not enough. You get no logic, no prioritization, and it's IaC dependent. So you really need a solution that is built for your infrastructure as code. And again, you get zero code coverage. That's not enough, right? To get code coverage, let's start using IAST, a modern AppSec tool, maybe the most accurate and reliable one. It really enables developers and DevSecOps, but there are no servers to instrument, right? We're running on a Lambda function. Trying to run an IAST on a Lambda function is really overkill and hasn't worked before. So let's try another solution. Let's run a SAST, static analysis security testing. Well, looking into this, the SAST will really see three different apps, just because there are three Lambdas, because it cannot have the full flow, because the pieces of code are not connected to each other. This code does not continue here inside the code. 
It needs to understand that there is a configuration that says to write to this queue, and that this function reads from this queue, and then connect them together. But it's not possible, because it's not in the code, it's in the configuration. So it doesn't see a source or a sink; it doesn't understand where the databases or the queues or anything are. So really, SAST will give you bad results, false positives and really false negatives, because it doesn't see things. So what should we do? Run a DAST, dynamic application security testing? Well, you could, and you'll be able to test this API specifically, because that's the only one with a URL or an API. So this function maybe, but I'm not sure what you will actually get from it, because, if I understand this correctly, it's not a synchronous flow, right? So the Roomba will send an API request, which will return 200 OK, 403 unauthorized, 404, whatever that is. But the rest of the application and the process hasn't happened yet. So the API fuzzer will always get OK or unauthorized. You can test some things, I'm not saying you cannot, but most of the flows and the coverage will not be able to run. But there is a solution. Because we're in the cloud, we should do things differently and better. For example, what we do is we build something that connects into the cloud. So once you connect into the cloud, you can get all the information from the cloud; you don't need to do anything. So three clicks: you get your template, your CloudFormation template or whatever infrastructure as code you're using, you run it, you connect to the cloud, you get the right permissions to do it. You can run discovery, get all the information, all the resources, all the relationships, all the interfaces, the policies, all the services in the environment, and connect them together into the graph that I showed you before. 
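Editor's sketch (not the speaker's tool): the discovery step described above, where resources and their relationships are connected into a graph, can be illustrated with a plain adjacency map and a breadth-first search. The edge list is a hypothetical inventory loosely modeled on the three-function registration example; in practice such edges would come from triggers and policies in the cloud configuration, which is exactly the information that a code-only scanner does not see.

```python
from collections import deque

# Hypothetical inventory: each pair is (source, target), derived from
# configuration (triggers, queue subscriptions, policies), not from code.
EDGES = [
    ("api:/register", "fn:register"),
    ("fn:register", "queue:registrations"),
    ("queue:registrations", "fn:process"),
    ("fn:process", "fn:notify"),
    ("fn:notify", "iot:devices"),
]


def build_graph(edges):
    """Turn an edge list into an adjacency map that includes every node."""
    graph = {}
    for source, target in edges:
        graph.setdefault(source, []).append(target)
        graph.setdefault(target, [])
    return graph


def reachable_from(graph, start):
    """Breadth-first search: every node reachable from start, in visit order."""
    seen = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.append(neighbour)
                queue.append(neighbour)
    return seen[1:]


graph = build_graph(EDGES)
print(reachable_from(graph, "api:/register"))
```

Starting from the public API, the search reaches all three functions, the queue and the IoT service, which is the connected flow that a per-function scan would treat as three unrelated apps.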
Then you can start analyzing your weaknesses, your risky points, your code, your attack surfaces, and try to understand where there might be problems. And because the cloud is built in a way that every service works with something that is pre-built by the cloud, you can also simulate those things. So what we do is we automate security simulations on Lambda functions in this case. So let's say your Lambda gets an API call and writes into the database. This is what we're going to tell the function to do. Take this input, it's an API call. It's not, but we can simulate that, and try to write into the database. And let's see what happens. Maybe I can write a file, maybe I can access different tables, maybe I can delete data, and then I can also check if I actually did it, because I'm inside the cloud, right? So let's say I'm trying to upload a file: I can check if the file was uploaded because I'm there, if I have the right permission. Then of course if something happened, we can report. And the best thing here, other than the three clicks instrumentation, let's call it, is that you can do it continuously. You can continue to monitor the environment. You don't need to run a synchronous or point-in-time scan. We continuously monitor the cloud. If you now push or deploy new code or a new configuration, we'll pick it up and we'll test it automatically, so you don't have to do anything else. Everything happens autonomously in the background. New code, new scan; new configuration, new scan. You fixed your vulnerabilities? We'll retest automatically, and if you fixed it, we'll just eliminate the issue, so you don't even have to interact with the security team on that. This is an illustration of what we're going to do. So the developer deploys new code, a new API with a new function. We're going to test this specific flow and identify potential vulnerabilities. And once we do, we'll also know what else in your cloud is at risk. 
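Editor's sketch (not the speaker's tool): one simple way to reason about "what else in your cloud is at risk" is to list which action and resource pairs a function's policy would permit if the function were compromised. The matcher below is deliberately simplified and is an assumption for illustration only: it handles Allow statements with `*` wildcards via `fnmatchcase`, compares action names case-sensitively, and ignores Deny statements, conditions and permission boundaries, all of which real IAM evaluation takes into account. The ARNs and candidate actions are hypothetical.

```python
from fnmatch import fnmatchcase

ORDERS = "arn:aws:dynamodb:us-east-1:123456789012:table/orders"  # hypothetical
USERS = "arn:aws:dynamodb:us-east-1:123456789012:table/users"    # hypothetical
ACTIONS = ["dynamodb:PutItem", "dynamodb:Scan", "dynamodb:DeleteItem"]


def as_list(value):
    """IAM-style fields may hold a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def blast_radius(statements, candidate_actions, candidate_resources):
    """List (action, resource) pairs that the Allow statements would permit.

    Simplified on purpose: no Deny statements, conditions, or boundaries.
    """
    permitted = []
    for action in candidate_actions:
        for resource in candidate_resources:
            for statement in statements:
                if statement.get("Effect") != "Allow":
                    continue
                action_ok = any(
                    fnmatchcase(action, pattern)
                    for pattern in as_list(statement["Action"])
                )
                resource_ok = any(
                    fnmatchcase(resource, pattern)
                    for pattern in as_list(statement["Resource"])
                )
                if action_ok and resource_ok:
                    permitted.append((action, resource))
                    break  # one matching statement is enough for this pair
    return permitted


scoped = [{"Effect": "Allow", "Action": "dynamodb:PutItem", "Resource": ORDERS}]
broad = [{"Effect": "Allow", "Action": "dynamodb:*", "Resource": "*"}]

print(blast_radius(scoped, ACTIONS, [ORDERS, USERS]))
print(len(blast_radius(broad, ACTIONS, [ORDERS, USERS])))
```

With the scoped statement only a PutItem on the orders table is possible, while the wildcard statement permits every listed action on both tables; that gap is one rough measure of how much a single vulnerable function can expose.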
If someone managed to do that, we'll know down to the specific table and the specific action inside the table. There is also a nice example here, which I showed at Black Hat two years ago, where I hacked a Lambda function with my voice, talking to an Alexa device. Well, this is something that developers did not expect, right? Because it's not something that you were used to before, but you should take that into consideration now. All right, we're getting close. So our tool can automatically give you vulnerabilities and maybe even better policies out of the box. Copy and paste into your environment, and get a least privilege permission set for each of your functions without doing anything. We're scanning the code, we're emulating the code, we're looking into the policy, we're seeing what the function actually needs, and that's what we give to you. Okay. Of course, I cannot cover everything. So what can you do to learn more? First, there is the OWASP Serverless Top 10 project. OWASP, if you're unfamiliar with it, is an open organization, the most famous one for application security. And there is a project I lead together with some colleagues around the serverless world, or industry, trying to identify the top risks for serverless. Right now there is an open call, so if your organization works with serverless and you have someone with some insight into security issues and security risks, please click this or go to this address and fill in the form. We'll take it into consideration. All the data that is sent is anonymous, of course. We'll collect it and it's public, so we want to get the best results from the industry. Lastly, there is another open source project which you can deploy on your cloud with just three clicks. It's DVSA, a Damn Vulnerable Serverless Application that I created, completely serverless, and you can install it with just three clicks. Really, you just need an AWS account and the right permissions to install. 
Just make sure you do not install it into a production account or any account with sensitive information, because it's a vulnerable application and it's potentially going to give someone access to your data. Go here. Learn more. There are videos and tutorials, and you can learn how to secure and attack your serverless applications. That's it. Thank you very much for participating in this talk, and you're welcome to shoot me an email anytime. Thanks.
Tal Melamed

Senior Director @ Contrast Security



