Transcript
            
            
              This transcript was autogenerated. To make changes, submit a PR.
            
            
            
            
              Our name
            
            
            
              Hello there, and thank you so much for joining me
            
            
            
              here at Conf42: integrating cloud native security
            
            
            
              into the SRE culture. It's really great to be here.
            
            
            
              I hope you're enjoying the conference so far. Now,
            
            
            
              my name is Anaïs Urlichs. I'm the open source developer advocate at Aqua
            
            
            
              Security, and in this talk I want to
            
            
            
              speak about the overlap between site reliability engineering and
            
            
            
              cloud native security. How can both benefit from each
            
            
            
              other? How can we integrate cloud native security tools and best practices
            
            
            
              into our site reliability engineering? Now,
            
            
            
              last year I was actually working as an SRE in between positions
            
            
            
              as a developer advocate. I've also been a
            
            
            
              CNCF ambassador since 2021 and
            
            
            
              I have a YouTube channel that you can see here where I talk about cloud
            
            
            
              native tools and trends, with tutorials on how to set them up, how to
            
            
            
              use them most effectively. And I have a weekly DevOps newsletter where I
            
            
            
              share amazing content from across the space with the
            
            
            
              community. So if you are curious, do check those resources
            
            
            
              out. You can find all of the links on my Twitter.
            
            
            
              I also have a puppy. She's just five months old
            
            
            
              and she might make a little bit of noise in the background, so I apologize
            
            
            
              for that. She's making up for it by being super adorable,
            
            
            
              however. Now, last year when I was working
            
            
            
              as an SRE, the SRE team at
            
            
            
              the startup was just about setting up all of the practices
            
            
            
              and the SRE culture within the company.
            
            
            
              And this is a slide taken from our KubeCon talk
            
            
            
              that a manager of mine and I gave in
            
            
            
              2021 at KubeCon North America.
            
            
            
              And we basically talked about the different Kubernetes operators
            
            
            
              that we were using across our infrastructure.
            
            
            
              And the infrastructure was very dynamic and very
            
            
            
              complex, let's say, because we had basically
            
            
            
              several superclusters across different regions across the world,
            
            
            
              Frankfurt, New York, London, you name it.
            
            
            
              And basically what we had there were compute
            
            
            
              racks, and those compute racks were our supercluster.
            
            
            
              And in those compute racks we have compute nodes,
            
            
            
              and you can see them depicted here in these drawings; these
            
            
            
              are all compute nodes, and the details won't
            
            
            
              matter here. But basically tenant clusters, customer clusters
            
            
            
              would then be scheduled on those compute nodes within. So we
            
            
            
              would have clusters within a large supercluster.
            
            
            
              And as you can imagine, you need very advanced
            
            
            
              observability tools to get the necessary
            
            
            
              insights to understand what is going on where. For example,
            
            
            
              if a tenant cluster is stuck in some state and
            
            
            
              you need to repair it manually, you kind of need to kick it, you need
            
            
            
              to know how to identify those tenant clusters right in the
            
            
            
              easiest and fastest way possible. The thing
            
            
            
              is, at the time we were talking a lot about our operator design and
            
            
            
              about our observability tools, but we weren't really talking about cloud
            
            
            
              native security. So before we go into cloud native
            
            
            
              security, I want to talk a little bit more about the SRE culture.
            
            
            
              I mentioned that we were really focusing on establishing an SRE culture
            
            
            
              and that really focused
            
            
            
              on these different areas. The first one is continuous improvement.
            
            
            
              You don't want to keep the state of your services the same, even though
            
            
            
              things might be working. You want to continuously improve your setup
            
            
            
              and the tooling that you have in place to gain insights on your services.
            
            
            
              And that's also related to embracing risk.
            
            
            
              Of course you want to keep the risk profile low and not deploy
            
            
            
              something that might bring down all your infrastructure accidentally.
            
            
            
              But ultimately it's a balance between both, because without
            
            
            
              embracing risk and taking new steps in advancing
            
            
            
              your tooling, you can't really improve the tooling itself
            
            
            
              and it will slowly deteriorate. Then the other thing is analyze
            
            
            
              learnings, analyze your failures and
            
            
            
              learn from them. We had lots of incidents
            
            
            
              in the early times, so we had
            
            
            
              lots of incidents, varying in all kinds
            
            
            
              of severity and degree. We were using tools at
            
            
            
              a scale they hadn't been used at before.
            
            
            
              So we encountered some real edge cases.
            
            
            
              So a lot of times we had to sit down with the other companies,
            
            
            
              with the other projects and really analyze what had happened, so that both
            
            
            
              the projects, well, both our company, but also the projects can learn
            
            
            
              from that. And the last thing is autonomy. It was my
            
            
            
              first SRE role where I had experience in the cloud native
            
            
            
              space. I didn't have experience working with production environments,
            
            
            
              but I received a lot of autonomy. And I think that was really beneficial to
            
            
            
              have that trust and focus within the team. So that's ultimately
            
            
            
              what the SRE culture is about, to me. I know
            
            
            
              it's different across different teams. You will have different implementations and similar,
            
            
            
              but this is kind of what you can think of when you think about the
            
            
            
              SRE culture. What is DevSecOps? Usually I
            
            
            
              ask people what DevSecOps is, but then people are really
            
            
            
              shy and really, this is a conference about DevSecOps.
            
            
            
              So I think that everybody has kind of an idea of what
            
            
            
              DevSecOps is. So just think to yourself, okay, this is what I think about
            
            
            
              when I hear DevSecOps. Some people might think about
            
            
            
              buzzwords, some people might have specific terminology in mind.
            
            
            
              Now I think about integrating security into all of our business functions by
            
            
            
              empowering people and creating accountability.
            
            
            
              And every word here is kind of, I carefully picked every
            
            
            
              word here. So we want to incorporate security into every
            
            
            
              business function, whether that's administration or engineering. Because ultimately,
            
            
            
              if everybody's empowered to take ownership of their part
            
            
            
              of what they are working with, right, then you can cover all
            
            
            
              areas within the business. So it's really about empowering
            
            
            
              people to take that ownership, to know what they are supposed to do,
            
            
            
              how they can do it, how they can ask for help and similar.
            
            
            
              And then when things go wrong,
            
            
            
              if they happen or don't happen, both ways: if things go well, but also if
            
            
            
              things go bad, then you can create accountability and
            
            
            
              have more productive conversations, right? It's not about finger pointing,
            
            
            
              it's about having more productive outcomes in the end. So that's
            
            
            
              what DevSecOps is all about to me, to really
            
            
            
              make things happen across the business by
            
            
            
              shared ownership. So next
            
            
            
              thing: if you're taking anything away from this talk,
            
            
            
              it should be that SRE practices and security practices
            
            
            
              have a really tight overlap. Ultimately, when we define what healthy services
            
            
            
              look like, we should also define what secure services look like,
            
            
            
              because only secure services are
            
            
            
              healthy services. So that's basically what this
            
            
            
              talk is boiling down to.
            
            
            
              When I moved from my work as
            
            
            
              an SRE back into developer advocacy for open source
            
            
            
              tools at Aqua Security, I realized that there's such a strong overlap
            
            
            
              between both and it just doesn't make sense to completely decouple
            
            
            
              them. And I know in many businesses you will have a separate security team.
            
            
            
              That's a great thing, right? But at the same time, we should
            
            
            
              also see, okay, how can different areas
            
            
            
              benefit from each other, and how can we make, for example, something like integrating
            
            
            
              security as easy as possible? So the idea is,
            
            
            
              start with, if you have an SRE team, if you have people focused on
            
            
            
              observability, start with those.
            
            
            
              So here are some additional goals that you might have within
            
            
            
              your SRE team that are also security goals,
            
            
            
              or they're tightly coupled, let's say. So when we
            
            
            
              focus on how we can scale our services, we also have to talk
            
            
            
              about how we can keep those services secure over time as they become
            
            
            
              more complex, as we scale up and down our services based
            
            
            
              on demand, our replicas, if they scale,
            
            
            
              then we also have to talk about, okay, how can we keep those secure?
            
            
            
              The next thing is visibility. Within your observability
            
            
            
              tools, you obviously want to gain visibility, insights into what's deployed,
            
            
            
              where, how is it deployed, who deployed it, when was it deployed,
            
            
            
              how is it interacting with other services? Is it maybe causing failure
            
            
            
              in other services? And similar. Those are all questions and topics related
            
            
            
              to visibility. Now, when you are getting started with cloud
            
            
            
              native security, you want to focus on security scanning,
            
            
            
              you want to focus on getting more insights into the
            
            
            
              security posture of your services. And all of those also contribute to
            
            
            
              how we gain more visibility into those different areas.
            
            
            
              The next thing is reducing noise. There's something I'm going to talk
            
            
            
              about a little bit more, which is called vulnerability fatigue,
            
            
            
              and which basically means that you're bombarded with
            
            
            
              security issues and you can't keep up with fixing them or taking
            
            
            
              care of all of them. So within your
            
            
            
              cloud native security, you want to focus on the most
            
            
            
              productive and the most efficient information
            
            
            
              that you can take actionable steps from.
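As a minimal sketch of that kind of noise reduction, assume a scanner report shaped like Trivy's JSON output: a top-level `Results` list whose entries carry a `Vulnerabilities` list with `Severity` and `FixedVersion` fields. You could then keep only the findings that are both severe and fixable. The sample report, the placeholder IDs, and the function name `actionable_findings` are made up for illustration.

```python
from typing import Any

# Hypothetical report, loosely mirroring the shape of Trivy's JSON output.
# The IDs, packages, and versions below are placeholders, not real findings.
SAMPLE_REPORT: dict[str, Any] = {
    "Results": [
        {
            "Target": "example-image (placeholder)",
            "Vulnerabilities": [
                {"VulnerabilityID": "EXAMPLE-0001", "PkgName": "libfoo",
                 "Severity": "CRITICAL", "FixedVersion": "1.2.4"},
                {"VulnerabilityID": "EXAMPLE-0002", "PkgName": "libbar",
                 "Severity": "CRITICAL", "FixedVersion": ""},
                {"VulnerabilityID": "EXAMPLE-0003", "PkgName": "libbaz",
                 "Severity": "LOW", "FixedVersion": "0.9.1"},
            ],
        },
        # A target with no findings may carry no vulnerability list at all.
        {"Target": "clean-target (placeholder)", "Vulnerabilities": None},
    ]
}


def actionable_findings(report: dict[str, Any],
                        severities: tuple[str, ...] = ("CRITICAL", "HIGH")) -> list[dict[str, str]]:
    """Keep only findings that are severe enough and already have a fix."""
    findings = []
    for result in report.get("Results", []):
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in severities and vuln.get("FixedVersion"):
                findings.append({
                    "target": result.get("Target", "unknown"),
                    "id": vuln["VulnerabilityID"],
                    "package": vuln.get("PkgName", "unknown"),
                    "fixed_in": vuln["FixedVersion"],
                })
    return findings


if __name__ == "__main__":
    for finding in actionable_findings(SAMPLE_REPORT):
        print(finding)
```

The filter is deliberately strict: a critical finding without a fix is still worth tracking, but it is not something a team can act on today, so it stays out of the short list.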
            
            
            
              Similarly, within your SRE team, you might have thousands,
            
            
            
              and thousands of logs that you obviously can't filter through manually.
            
            
            
              So similar to that, you want to have processes,
            
            
            
              workflows, but also tools in place that help you to reduce that, all that
            
            
            
              noise. The next thing is automation. Automation is great for
            
            
            
              different aspects. It's making our lives obviously easier.
            
            
            
              But I'm going to talk a bit about the downsides of automation and what we
            
            
            
              have to be careful about when we do automation for
            
            
            
              SRE work, for observability tools, but also for cloud native security.
            
            
            
              The last thing is what I already mentioned, ownership,
            
            
            
              communication is key for both areas.
            
            
            
              So here are some of the more practical items that are just
            
            
            
              what a lot of SRE teams do, what we can also
            
            
            
              adapt for our cloud native security. Getting started with
            
            
            
              cloud native security, the first one is investing in runbooks and documentation.
            
            
            
              So when we define how to respond to different types of incidents,
            
            
            
              when to escalate an incident, what steps to take during an incident,
            
            
            
              the same thing we can do for any security
            
            
            
              issues that we might have within our tooling. So we
            
            
            
              could, for example, define okay, if there's a critical vulnerability, what steps
            
            
            
              have to happen, who has to take those steps, and similar. Then
            
            
            
              the other items are really also something that can be
            
            
            
              adapted for both teams. If you have different teams, if you
            
            
            
              have security teams or people focused on security versus people
            
            
            
              focused on site reliability engineering, or you can integrate
            
            
            
              one into the other. So here are some of
            
            
            
              the tools that we used in that startup that I mentioned
            
            
            
              where I was working as an SRE. So the observability tools are
            
            
            
              really like your standard stack, I would say, with Grafana, Prometheus, and Jaeger;
            
            
            
              we tried to install Tempo. We used Grafana Loki for
            
            
            
              logs. For management, we mainly used Helm and Terraform.
            
            
            
              It was very much Helm and Terraform focused, and
            
            
            
              then we used GitLab CI/CD pipelines.
            
            
            
              We talked a lot about these different tools and the different integrations
            
            
            
              and installation of those different tools.
            
            
            
              However, we didn't talk about security tools.
            
            
            
              That's like something we didn't really talk about. We had at some point,
            
            
            
              I mean, we were following security best practices, right? Like, don't think we
            
            
            
              were not. But at some point we had an intern
            
            
            
              who was a university student who was helping us
            
            
            
              implement tools such as kube-bench
            
            
            
              from Aqua Security as well. Now,
            
            
            
              just quickly mentioning every tool that I showcase
            
            
            
              here from Aqua, these are all Aqua's open source
            
            
            
              tools. I am not promoting any enterprise tools in
            
            
            
              this talk. So you
            
            
            
              don't have to sign up. They're all free to use on GitHub.
            
            
            
              You're not sending us any data, and similar.
            
            
            
              So since there is so little conversation about how we
            
            
            
              can actually get started with cloud native security, for example
            
            
            
              in your SRE team and similar,
            
            
            
              I've thought about okay, here are different steps that you can take.
            
            
            
              It's one approach, right? There are different approaches.
            
            
            
              This might be one approach. So we're
            
            
            
              going to focus on a security scanner. As our main
            
            
            
              security tool, we're going to focus on Trivy. Trivy is an all-in-one security
            
            
            
              scanner. All in one because it can scan all of
            
            
            
              those different scan targets. It also has SBOM
            
            
            
              functionality features and cloud
            
            
            
              provider account scanning, starting with AWS. It also can do
            
            
            
              in-cluster scanning of running workloads. So it's a very, very versatile
            
            
            
              tool that's focused on different users and different workflows.
            
            
            
              So step one in our ten step journey
            
            
            
              is understanding your need. That's really important because if you have no
            
            
            
              idea what you're actually aiming for, then you don't know what to look out for,
            
            
            
              right? So our need will be influenced. Before we can
            
            
            
              define our need, we have to be aware of the influencing
            
            
            
              factors on that, on our goals, on what
            
            
            
              we actually need to accomplish. So the first
            
            
            
              one is the size of our team, right? If you are working as an individual
            
            
            
              contributor, the needs for the different tools
            
            
            
              and the way that you need to integrate security tooling
            
            
            
              and practices will be different than if you're working within a large-scale team.
            
            
            
              The next thing is the industry you're working in.
            
            
            
              Is it a highly regulated industry that requires you to choose
            
            
            
              specific tools, work with a specific
            
            
            
              stack? Or are you working for a startup where you just make things work
            
            
            
              in the best way possible with the tools available? Then the
            
            
            
              type of technologies you're working with, it's also related to the
            
            
            
              integrations that are available. Do you need to have a custom setup with your
            
            
            
              custom on-premise infrastructure? Then
            
            
            
              your need will be quite different to somebody who's managing
            
            
            
              an open source project, for example, or managing,
            
            
            
              I don't know, a small retail website.
            
            
            
              Right? Then the company goals and leadership.
            
            
            
              A lot of times security, whether to acquire the skills or
            
            
            
              the tools, is related to having budgets and expertise,
            
            
            
              right? It's usually something that people keep
            
            
            
              as the last thing to take care of, which is obviously
            
            
            
              an issue. But yeah, it's one of the factors that you want to
            
            
            
              take into account. It doesn't mean when you want to get started with cloud native
            
            
            
              security and integrating cloud native security, it doesn't mean you need to have a
            
            
            
              budget and expertise already available within your team.
            
            
            
              It just means that that is one of the factors that can influence
            
            
            
              which tools you're using in the end. Now tools will
            
            
            
              differ in different ways. That's also something you want to keep in mind. The first
            
            
            
              one is the installation. Different tools are installed differently. A lot of the
            
            
            
              cloud native security scanners are used as CLI tools,
            
            
            
              so you use them either in your local terminal or in your CI/CD
            
            
            
              pipeline. Other tools come as Kubernetes operators
            
            
            
              and other Kubernetes resources and can be installed within your cluster.
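For the CLI route, a rough sketch of wrapping a scanner in a script that could run locally or in a CI/CD job might look like the following. It assumes the `trivy` binary is on the `PATH` and uses its `image` subcommand with the `--format` and `--severity` flags; the helper names `build_trivy_command` and `scan_image` are hypothetical, and how you handle the output is up to your own pipeline.

```python
import shutil
import subprocess
from typing import Optional, Sequence


def build_trivy_command(image: str,
                        severities: Sequence[str] = ("HIGH", "CRITICAL")) -> list[str]:
    """Assemble the CLI call as a list so that no shell quoting is needed."""
    return [
        "trivy", "image",
        "--format", "json",                   # machine-readable report
        "--severity", ",".join(severities),   # only report these severities
        image,
    ]


def scan_image(image: str) -> Optional[str]:
    """Run the scan and return the raw JSON report, or None if trivy is missing."""
    if shutil.which("trivy") is None:
        return None
    completed = subprocess.run(
        build_trivy_command(image),
        capture_output=True,
        text=True,
        check=False,  # leave exit-code handling to the caller
    )
    return completed.stdout


if __name__ == "__main__":
    # The image name is a placeholder; swap in one of your own.
    print(build_trivy_command("registry.example.com/my-service:1.0.0"))
```

Keeping the command construction separate from the execution makes it easy to review exactly what the scanner is asked to do before it ever runs in a pipeline.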
            
            
            
              Now you want to be wary of the tools that do something within
            
            
            
              your cluster because security scanners will need lots
            
            
            
              and lots of privileges within your cluster to perform proper security
            
            
            
              scanning. So whenever you are signing up to a
            
            
            
              tool and you give it access to your cluster,
            
            
            
              you want to be mindful of what is it actually doing within your cluster,
            
            
            
              who's getting that data from those scans versus
            
            
            
              if you install, for example, an open source Kubernetes operator within your
            
            
            
              cluster and it performs just the scans within your cluster and the reports
            
            
            
              and resources of the scans are only available within the cluster. Then you know
            
            
            
              it's really contained there within your existing environment.
            
            
            
              Next thing is scan coverage. We get lots of questions
            
            
            
              in Trivy, in the project issues and so on,
            
            
            
              where people are asking why this scan from Trivy differs from
            
            
            
              that scan from another tool. And basically Trivy
            
            
            
              has a Trivy database, which is a separate project under the Aqua open source
            
            
            
              umbrella and it's pulling from different data sources,
            
            
            
              for example, lists of vulnerabilities. Then the next thing
            
            
            
              where tools differ in quite a significant way is the
            
            
            
              number of integrations and the type of integrations available,
            
            
            
              especially if you're going with an open source security scanner.
            
            
            
              You want to be mindful of the integrations that are available,
            
            
            
              so more mature scanners will have more integrations
            
            
            
              available, usually. The last thing is the focus.
            
            
            
              Different tools are focused on different people, different types of audiences. Some might
            
            
            
              be focused on security professionals, others are focused on engineers.
            
            
            
              So here is an example of need
            
            
            
              driven development from device engineering
            
            
            
              blog. They basically detailed how they changed their security
            
            
            
              scanning to gain better insight into the security posture of their
            
            
            
              services. And here are the four goals that they want to accomplish with
            
            
            
              that change. The first one is assign ownership of vulnerabilities.
            
            
            
              They wanted to have people, different people within the team,
            
            
            
              take ownership of different vulnerabilities. So actually somebody,
            
            
            
              it's going to be somebody's job to take care of and fix that vulnerability.
            
            
            
              The next thing is they want to have a global view of the security state
            
            
            
              of services. And that's very important, because it's
            
            
            
              not that helpful to only analyze
            
            
            
              specific services, right? And to fix specific services. Only
            
            
            
              if you have a global view can you then see how
            
            
            
              other changes, wider changes, for example in your workflows
            
            
            
              or adopting other tools, external tools, have an impact
            
            
            
              on your overall security posture.
            
            
            
              Then they want to develop dashboards for different users and requirements,
            
            
            
              and that's more related to breaking down the security issues
            
            
            
              related to specific services. And they want to overcome difficult to
            
            
            
              use and different UIs. A lot of times in the cloud native ecosystem, whenever you're
            
            
            
              using a new tool, you're adopting new workflows
            
            
            
              and you're adopting a new UI and interface and frameworks,
            
            
            
              and that takes time to first of all get used to them, to learn
            
            
            
              your way around it, and you will always then have to do something separate
            
            
            
              to what you have already been doing. So they wanted to integrate
            
            
            
              their tools, their tooling, their security tools into their existing workflows, to have
            
            
            
              just this one thing
            
            
            
              to go to. Then step two, once we
            
            
            
              know what we actually want to do,
            
            
            
              what we want to achieve, and how different tools differ and
            
            
            
              so on, and what factors we have to keep in mind, we want to choose
            
            
            
              a cloud native security scanner. Now here
            
            
            
              is a list of different cloud native open source security scanners
            
            
            
              in the space. And they are focused on different types of scanning. For example,
            
            
            
              some are just focused on vulnerability scanning, others are focused on infrastructure as
            
            
            
              code misconfiguration scanning. Others are compliance scans.
            
            
            
              Now compliance scans, for example, would more likely
            
            
            
              be used by security professionals, whereas in-cluster
            
            
            
              scans might also then be used by cluster admins.
            
            
            
              As you can see, Trivy is really across those different areas
            
            
            
              since it's an all in one security scanner. It does lots
            
            
            
              of different things, but if you just need vulnerability scanning, you might want to
            
            
            
              consider, for example, another tool that focuses on vulnerability scanning.
            
            
            
              And here's the list. Now once we have
            
            
            
              looked at the different scanners, in our case we're going with Trivy
            
            
            
              because I'm familiar with Trivy. We want to set it up
            
            
            
              and make sure everything is running properly. And sometimes you
            
            
            
              might go with one scanner and then you set it
            
            
            
              up and you play around with it and you realize it's not the right tool
            
            
            
              either because the workflow is not intuitive for you or
            
            
            
              something is just not working and it's completely fine to go back to step two
            
            
            
              and be like, okay, we actually want to use a different scanner now.
            
            
            
              In our case we're using Trivy. Now we want to make sure it's working
            
            
            
              properly. So the first thing is identify the best installation options.
            
            
            
              Also, Trivy comes with different installation options. Now I usually go with a Helm
            
            
            
              installation inside of my cluster in addition to having automated
            
            
            
              CI/CD pipeline scanning. Then you want to decide
            
            
            
              upon a different configuration. For example, if you're
            
            
            
              using Trivy in combination with observability tools such as Prometheus,
            
            
            
              you have to configure some parts slightly differently.
            
            
            
              You then want to test those custom configurations and
            
            
            
              ensure that it's working properly with all tools that it's supposed to
            
            
            
              work with. So for example, if you have some niche
            
            
            
              cases where Trivy is supposed to perform,
            
            
            
              I don't know, a thousand vulnerability
            
            
            
              scans of different containers, right?
            
            
            
              And then on a regular basis, something like that, like some really edge
            
            
            
              case, you want to test it out in a small
            
            
            
              environment first, and that's with every tool, right? You want to
            
            
            
              test out your specific edge case in a small environment before
            
            
            
              you implement it in a large scale environment.
            
            
            
              Now here is an overview, very simplified overview of a Kubernetes
            
            
            
              cluster, what that might look like once you've installed Trivy.
            
            
            
              The first thing is you have like maybe an application namespace
            
            
            
              with all your application related resources. Then you have a monitoring namespace with
            
            
            
              your Prometheus, Grafana, other observability tools, and
            
            
            
              then you have your trivy-system namespace with the Trivy Operator. Now the Trivy
            
            
            
              Operator is the part of Trivy that does continuous in-cluster
            
            
            
              scanning of your running workloads.
            
            
            
              In addition to that, you could then also use Trivy, the CLI
            
            
            
              tool in your CI/CD pipeline or also on your developer
            
            
            
              machines. The beautiful thing is if everything is a Kubernetes resource,
            
            
            
              you can then use the same processes across your stack. So for
            
            
            
              example, here you can use the same processes if everything is a Helm chart:
            
            
            
              Prometheus as a Helm chart, Grafana as a Helm chart, the Trivy Operator as a Helm chart. You can
            
            
            
              deploy and manage those applications through the same processes,
            
            
            
              which is really nice, really handy. So here's what you
            
            
            
              will then see inside of your trivy-system namespace.
            
            
            
              Now alongside the Trivy
            
            
            
              Operator you will then also have several Kubernetes
            
            
            
              custom resource definitions deployed, CRDs,
            
            
            
              and they basically extend the Kubernetes API to
            
            
            
              allow for custom security scans.
            
            
            
              So here we have the metrics of our different security scans.
            
            
            
              Trivy does vulnerability scans of any container image it
            
            
            
              finds inside of your cluster. It does exposed secret scans.
            
            
            
              Are there any exposed secrets within your cluster? Then,
            
            
            
              is there any RBAC misconfigured,
            
            
            
              any role-based access control that should be changed,
            
            
            
              maybe? And then it also does config audit scans.
            
            
            
              Now the thing is, things might change dynamically and
            
            
            
              they shouldn't, inside of your cluster, right? Like, people might change things
            
            
            
              around manually, they might try out things, they might deploy set containers
            
            
            
              to debug things. I don't know what your company or team does,
            
            
            
              right? But Trivy will then identify any misconfigurations that
            
            
            
              are present within your cluster of those newly set up resources and
            
            
            
              can alert you on those. Now these
            
            
            
              are just the metrics from
            
            
            
              the security scans, from the security reports. The security reports
            
            
            
              themselves are just other Kubernetes resources. They are YAML manifests, the security
            
            
            
              reports, and you can read them like YAML manifests. And then because they
            
            
            
              are YAML manifests, because the security reports are Kubernetes
            
            
            
              resources, you can export them. For example, you can get the metrics out
            
            
            
              and then you can integrate them into your observability stack.
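Because the reports are ordinary Kubernetes resources, a small script can turn them into numbers. The following is a minimal sketch, assuming a list of vulnerability reports shaped like the JSON that `kubectl get vulnerabilityreports -A -o json` would return, with a `report.summary` block holding per-severity counts. The sample data and the function name `summarize_by_namespace` are made up for illustration, and the field names should be checked against the CRD version installed in your cluster.

```python
from collections import Counter

# Hypothetical sample shaped like `kubectl get vulnerabilityreports -A -o json`.
# Field names are an assumption; verify them against your installed CRDs.
SAMPLE = {
    "items": [
        {
            "metadata": {"namespace": "app", "name": "replicaset-web-1"},
            "report": {"summary": {"criticalCount": 2, "highCount": 5,
                                   "mediumCount": 7, "lowCount": 1}},
        },
        {
            "metadata": {"namespace": "app", "name": "replicaset-api-1"},
            "report": {"summary": {"criticalCount": 0, "highCount": 3,
                                   "mediumCount": 4, "lowCount": 0}},
        },
        {
            "metadata": {"namespace": "monitoring", "name": "statefulset-prom"},
            "report": {"summary": {"criticalCount": 1, "highCount": 0,
                                   "mediumCount": 2, "lowCount": 9}},
        },
    ]
}

SEVERITY_KEYS = ("criticalCount", "highCount", "mediumCount", "lowCount")


def summarize_by_namespace(report_list):
    """Sum the per-severity counts of all reports, grouped by namespace."""
    totals = {}
    for item in report_list.get("items", []):
        namespace = item["metadata"]["namespace"]
        summary = item.get("report", {}).get("summary", {})
        counter = totals.setdefault(namespace, Counter())
        for key in SEVERITY_KEYS:
            counter[key] += summary.get(key, 0)
    return totals


if __name__ == "__main__":
    for namespace, counts in sorted(summarize_by_namespace(SAMPLE).items()):
        print(namespace, dict(counts))
```

Grouping by namespace is only one choice; the same loop could group by a team label to get closer to the "global view" goal mentioned earlier.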
            
            
            
              That's the next step, setting up a dashboard.
            
            
            
              So we have Grafana and Prometheus installed, we have our security
            
            
            
              tools installed. It's time to set up a nice dashboard.
            
            
            
              This is the dashboard created by the community where we have
            
            
            
              a summary of our different vulnerabilities and
            
            
            
              they are broken down by severity. So in total we
            
            
            
              have 175 vulnerabilities in our cluster.
            
            
            
              Now you can also see all of the other metrics
            
            
            
              directly through a dashboard as well in Grafana.
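To make the severity breakdown concrete, here is a minimal sketch of the aggregation a dashboard panel performs. It assumes the operator exposes a gauge named `trivy_image_vulnerabilities` with `namespace` and `severity` labels in the Prometheus text format; the sample values are made up, and the helper `sum_by_label` is hypothetical.

```python
import re
from collections import defaultdict

# Made-up sample in the Prometheus text exposition format. The metric and
# label names are assumptions; check what your operator actually exposes.
SAMPLE_METRICS = """\
# HELP trivy_image_vulnerabilities Number of container image vulnerabilities
# TYPE trivy_image_vulnerabilities gauge
trivy_image_vulnerabilities{namespace="app",severity="Critical"} 3
trivy_image_vulnerabilities{namespace="app",severity="High"} 12
trivy_image_vulnerabilities{namespace="monitoring",severity="High"} 4
trivy_image_vulnerabilities{namespace="monitoring",severity="Low"} 20
"""

# metric_name{label="value",...} <number>
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def sum_by_label(text, metric, label):
    """Add up all samples of `metric`, grouped by the value of `label`."""
    totals = defaultdict(float)
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line.strip())
        if match is None or match.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels")))
        totals[labels.get(label, "unknown")] += float(match.group("value"))
    return dict(totals)


if __name__ == "__main__":
    print(sum_by_label(SAMPLE_METRICS, "trivy_image_vulnerabilities", "severity"))
```

In Grafana itself the same grouping would normally be a PromQL query along the lines of `sum by (severity) (trivy_image_vulnerabilities)`, under the same naming assumption.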
            
            
            
              And basically by breaking out those different
            
            
            
              vulnerabilities into different categories, it then makes it easier to
            
            
            
              identify the different types of vulnerabilities that
            
            
            
              you have. Now the next thing, what you might
            
            
            
              already be thinking about, is how do you avoid vulnerability hell? Because if
            
            
            
              you have 175 different vulnerabilities, how do
            
            
            
              you go about addressing them, how do you go about managing them? Those are,
            
            
            
              I'm not swearing, those are a lot of vulnerabilities
            
            
            
              here, right, that we can't manage all
            
            
            
              at once. Here's a screenshot from Alex Jones
            
            
            
              on Twitter saying I just give up and I
            
            
            
              just give up and die. No, that's a difficult sentence.
            
            
            
              Anyway, so he scanned
            
            
            
              a resource, I don't know what type of resource he actually scanned, but he scanned
            
            
            
              a resource with Snyk and found
            
            
            
              over 550 different vulnerabilities. And they are broken
            
            
            
              down into critical, high, medium and low as well. But still, there's
            
            
            
              lots of vulnerabilities. You can't look at 550 vulnerabilities
            
            
            
              or similar, right? Doesn't work. So here are some practical
            
            
            
              steps that you can take. First one is ignore all but critical vulnerabilities.
            
            
            
              He only has three critical vulnerabilities; that's easy to address.
            
            
            
              Just take care of the critical vulnerabilities first, and then
            
            
            
              look in a more productive way at the rest. Don't scan
            
            
            
              everything at once. I don't know if they scan just one resource or if you
            
            
            
              scan multiple resources, but there's really no need to scan everything at
            
            
            
              once. Just scan the most critical workloads first.
            
            
            
              Filter by vulnerabilities with known fixes. Trivy allows
            
            
            
              you, with an additional flag, to easily specify that you only want to see
            
            
            
              vulnerabilities that already have a fix available. So you could
            
            
            
              go ahead and do that. Just look
            
            
            
              at those vulnerabilities first, then filter vulnerabilities
            
            
            
              by team and by application. Really make them team and application specific. Give them
            
            
            
              context. Give them meaning, so that they are not just like a line of text of
            
            
            
              something that's wrong with any resources, right? That's ultimately what
            
            
            
              you don't want to have. And that's also related to the Wise Engineering blog
            
            
            
              post needs, right? So next
            
            
            
              thing, step six, what are metrics without alerts?
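Before turning to alerts, the triage steps just described can be sketched in a few lines. This is a minimal sketch under stated assumptions: the findings are plain dictionaries with hypothetical `severity`, `fixed_version`, `team` and `app` keys, not a real scanner's output format. With the Trivy CLI, the severity filter and the fix-available filter correspond to the `--severity` and `--ignore-unfixed` flags.

```python
from collections import defaultdict

# Hypothetical findings; the keys are illustrative, not a real scanner schema.
FINDINGS = [
    {"id": "CVE-A", "severity": "CRITICAL", "fixed_version": "1.2.3",
     "team": "payments", "app": "checkout"},
    {"id": "CVE-B", "severity": "CRITICAL", "fixed_version": "",
     "team": "payments", "app": "checkout"},
    {"id": "CVE-C", "severity": "HIGH", "fixed_version": "2.0.0",
     "team": "platform", "app": "gateway"},
    {"id": "CVE-D", "severity": "CRITICAL", "fixed_version": "4.5.6",
     "team": "platform", "app": "gateway"},
    {"id": "CVE-E", "severity": "LOW", "fixed_version": "",
     "team": "platform", "app": "gateway"},
]


def triage(findings, severities=("CRITICAL",), fixable_only=True):
    """Keep findings of the given severities, optionally only fixable ones,
    and group what is left by (team, app) so each group has an owner."""
    grouped = defaultdict(list)
    for finding in findings:
        if finding["severity"] not in severities:
            continue  # step 1: ignore everything but the chosen severities
        if fixable_only and not finding["fixed_version"]:
            continue  # step 3: skip findings without a known fix
        grouped[(finding["team"], finding["app"])].append(finding["id"])
    return dict(grouped)


if __name__ == "__main__":
    for owner, ids in triage(FINDINGS).items():
        print(owner, ids)
```

Once the critical, fixable findings are handled, widening `severities` to include `"HIGH"` is the natural next pass.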
            
            
            
              The thing is, and this is related to what I said earlier about automation,
            
            
            
              that I want to talk a little bit more about automation after I take a
            
            
            
              sip of coffee.
            
            
            
              Sorry, my throat is still a bit messed up from a cold.
            
            
            
              So basically, when we define
            
            
            
              our deployment resources, you need to define your deployment resources to deploy
            
            
            
              your application, right? That's a necessity. Otherwise your application is not deployed,
            
            
            
              it's not working, customers can't access it, customers are unhappy,
            
            
            
              right? You don't want that. So the thing is,
            
            
            
              you then need to obviously define those deployment resources. But the same doesn't
            
            
            
              hold true for security, right? You don't need to define
            
            
            
              how you scan your resources, you don't need to define like scan
            
            
            
              coverage. I don't know, all those things related to security you don't
            
            
            
              need to do to deploy your application, to have it working to make customers
            
            
            
              happy. Customers are only unhappy when things go wrong in the security
            
            
            
              world, right? Like when their data is ultimately exposed.
            
            
            
              So it's not a necessity for engineers, for anyone
            
            
            
              operating an application, operating a business to actually take care of the
            
            
            
              security of that. I mean, most of the services that you use online, you probably
            
            
            
              don't know what kind of critical vulnerabilities are within, and you
            
            
            
              shouldn't have to care about that. That's something for the business to care
            
            
            
              about. But that's exactly why you want to set
            
            
            
              up alerts and make your vulnerabilities scream at you,
            
            
            
              right? Give them a voice, make them,
            
            
            
              set them up in such a way that you cannot ignore them.
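One way to make vulnerabilities "scream" is to alert only on what is new since the last scan. The following is a minimal sketch, assuming two scan snapshots as lists of dictionaries with hypothetical `id` and `severity` keys; `new_findings` and `build_alert` are illustrative names, and in a Prometheus setup this logic would more typically live in an alerting rule routed through Alertmanager.

```python
# Hypothetical scan snapshots; the keys are illustrative only.
PREVIOUS_SCAN = [
    {"id": "CVE-A", "severity": "CRITICAL"},
    {"id": "CVE-B", "severity": "HIGH"},
]
CURRENT_SCAN = [
    {"id": "CVE-A", "severity": "CRITICAL"},
    {"id": "CVE-B", "severity": "HIGH"},
    {"id": "CVE-C", "severity": "CRITICAL"},
    {"id": "CVE-D", "severity": "LOW"},
]


def new_findings(previous, current, severity="CRITICAL"):
    """Return IDs at `severity` that are in `current` but not in `previous`."""
    before = {f["id"] for f in previous if f["severity"] == severity}
    after = {f["id"] for f in current if f["severity"] == severity}
    return sorted(after - before)


def build_alert(previous, current, workload):
    """Return an alert message, or None when nothing new showed up."""
    added = new_findings(previous, current)
    if not added:
        return None
    return f"{workload}: {len(added)} new critical finding(s): {', '.join(added)}"


if __name__ == "__main__":
    message = build_alert(PREVIOUS_SCAN, CURRENT_SCAN, "app/checkout")
    if message:
        print(message)  # in practice this would go to chat, email or a pager
```

Alerting on the difference rather than the total keeps the signal small enough that it is hard to ignore.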
            
            
            
              So once you do that, you can correlate
            
            
            
              metrics. So, for example, if you have a new critical vulnerability here,
            
            
            
              new vulnerabilities introduced,
            
            
            
              we can then correlate that dashboard from our vulnerabilities,
            
            
            
              from our misconfiguration issues that went up with our
            
            
            
              deployment dashboards and see, okay, what happened
            
            
            
              in our cluster, there's a new deployment,
            
            
            
              there's a new replica set, okay, that caused
            
            
            
              the vulnerabilities to go up, to have more inside of the cluster.
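The correlation described here is usually done by eye across two dashboards, but the idea can be sketched in code. This is a minimal sketch, assuming a time series of vulnerability counts and a list of deployment events with timestamps; all names, timestamps and the ten-minute window are made up for illustration.

```python
from datetime import datetime, timedelta

# Made-up data: (timestamp, total vulnerability count) samples, oldest first.
VULN_SERIES = [
    (datetime(2024, 1, 1, 10, 0), 5),
    (datetime(2024, 1, 1, 10, 5), 5),
    (datetime(2024, 1, 1, 10, 10), 9),
    (datetime(2024, 1, 1, 10, 15), 9),
]

# Made-up deployment events, as a deployment dashboard might show them.
DEPLOYMENTS = [
    {"name": "api-v3", "time": datetime(2024, 1, 1, 9, 30)},
    {"name": "web-v2", "time": datetime(2024, 1, 1, 10, 7)},
]


def find_increases(series):
    """Return the timestamps at which the count went up versus the prior sample."""
    return [t2 for (_, c1), (t2, c2) in zip(series, series[1:]) if c2 > c1]


def correlate(increases, deployments, window=timedelta(minutes=10)):
    """Map each increase to the deployments that happened shortly before it."""
    result = {}
    for ts in increases:
        result[ts] = [
            d["name"] for d in deployments
            if timedelta(0) <= ts - d["time"] <= window
        ]
    return result


if __name__ == "__main__":
    for ts, names in correlate(find_increases(VULN_SERIES), DEPLOYMENTS).items():
        print(ts.isoformat(), "likely related to", names)
```

The window size is a judgment call: it needs to be at least as long as the delay between a rollout and the next scan of the new workload.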
            
            
            
              Step eight is some additional tips that you can do,
            
            
            
              and some are iterating on the previous ones that I already mentioned. First one is
            
            
            
              assign ownership, really make it somebody's responsibility.
            
            
            
              And ideally, the person who's already managing that resource should look at its
            
            
            
              vulnerabilities. Don't introduce tools, many new tools
            
            
            
              at once. That's something lots of people want to do when they get started with,
            
            
            
              for example, cloud native security: implement it everywhere and everything
            
            
            
              at once, and that's complete overload,
            
            
            
              and people are likely not going to be able to adapt to
            
            
            
              those new processes. The next thing is utilize existing workflows,
            
            
            
              platforms and processes. Utilize it as much as
            
            
            
              possible because it makes it easier for people to actually look at the security
            
            
            
              reports in that case. Step nine is optimize based on what
            
            
            
              works for your team. A lot of times we can follow
            
            
            
              the initial setup, follow whatever company said,
            
            
            
              but ultimately every application will
            
            
            
              be differently deployed depending on your environment. A lot of times when I get questions
            
            
            
              about the Trivy Operator specifically and its
            
            
            
              deployment,
            
            
            
              I cannot answer those questions before I get more information on
            
            
            
              your setup, on your environment, on your needs, on all
            
            
            
              those different pieces that play a role, right? Because
            
            
            
              ultimately my answer will differ based on what
            
            
            
              your setup looks like and what applications you're
            
            
            
              already using in simulam. So there's really no one thing that
            
            
            
              works for everybody. And step ten
            
            
            
              don't stop at security scanning. There are lots of different types of
            
            
            
              security tools in the cloud native space.
            
            
            
              So for example, Tracee is a runtime security and forensics tool that
            
            
            
              analyzes events on the node level.
            
            
            
              So basically, while Trivy can scan for any
            
            
            
              misconfigurations once they have happened inside of your cluster,
            
            
            
              Tracee can detect if somebody uses a misconfiguration to do something they
            
            
            
              shouldn't do. Those are the main differences. So here you can see just
            
            
            
              a dashboard of its different logs. Now you would want to obviously
            
            
            
              filter them more in different ways to actually then have
            
            
            
              actionable steps to those logs. Because over 2000 logs,
            
            
            
              that's nothing you can really follow up on.
            
            
            
              And here are some of the resources used: the
            
            
            
              blog post from Wise Engineering on their application security journey.
            
            
            
              Then on the Aqua Open Source YouTube channel we have lots of different tutorials
            
            
            
              to get started with. Here's the Trivy GitHub repository and the
            
            
            
              Trivy Operator repository. If you Google Trivy or Trivy Operator, you should
            
            
            
              find it as well. And here's a demo project that I've been using
            
            
            
              on GitHub as well, and you can find us on
            
            
            
              Slack if you have any questions about this presentation,
            
            
            
              about anything I said, or about Trivy and other
            
            
            
              projects within the Aqua ecosystem.
            
            
            
              Now, I hope we have some time for questions. Otherwise, thank you so much
            
            
            
              for attending my talk. I hope you have an amazing rest of your day and
            
            
            
              hope to see you soon.