Transcript
This transcript was autogenerated.
Hello everyone.
Good morning, good afternoon, good evening, wherever you are from.
Thank you for tuning into my talk today.
A quick disclaimer before we get started.
This presentation includes some research points and also some
demos, which are purely based on my own views. As an attendee,
you are encouraged to perform your own research and consult a professional
before you start implementing this within your organization.
Speaker introduction.
My name is Anto Pel and I'm a cybersecurity professional with a
background in computer science and a master's in cybersecurity.
I hold many industry certifications like CISSP, CSSLP, and CASP.
Some of these are industry-specific security certifications. I
enjoy challenging myself, and I was able to acquire all these
skills while working in different organizations in different roles.
I'm also multi-cloud certified across all three
clouds: Azure, GCP, and AWS.
I am a lead cloud security engineer at Humana,
where I'm leading some of the cloud security initiatives
across the enterprise.
I'm passionate about threat hunting and capture the flag
challenges, because they give me an opportunity to think like a bad
actor so that we can start building controls in a more effective way.
I also contribute to the home automation project, and I like
connecting with people to talk about home automation, cybersecurity insights,
or general automation. Please reach out to me on LinkedIn
if you would like to connect.
You can also use the QR code, which is out there, to
connect with me on LinkedIn.
Moving on to our topic today, which is cloud security posture management
and cloud native integration.
We are gonna be doing a deep dive to understand what cloud
security posture management is,
focusing specifically on cloud native integration.
What we are gonna be talking about is how you secure the cloud with the cloud
provider's native capabilities, so that whenever any kind of deployment
goes to the cloud, you can secure it whether it comes from your IaC
or from the platform.
So let's see our agenda today.
We are gonna start with understanding the threat landscape
which is out there, right?
Because this is gonna give you some understanding of the growing complexity
inside the cloud and some smart things that you can do to secure it.
Then understanding the building blocks of
cloud security posture management, what really makes up all these things.
Then understanding infrastructure as code security, what
it is and why you need it.
And we are gonna be doing a deep dive into cloud native security policies
today, understanding how you implement them and
why you need those
policies at the CSP-specific level.
And then we are gonna be doing a deep dive into event-driven security
architecture, security posture management, and exception management, to get an
understanding of all these three things, with some key takeaways.
Moving on to the next slide where we are gonna be talking about current
cloud security challenges, right?
As you might have all seen, the utilization of cloud workloads is
going up, and the cloud in general comes with a shared responsibility model, right?
The cloud provider is responsible for a few things.
As a customer, you are responsible for managing a few things, especially
securing the configurations which the cloud provider
gives you access to. You need to ensure those are properly secured, right?
If not, you can see the growing attack surface, based on the
research published out there.
The attack surface is growing, right?
And because the attack surface is growing,
monitoring it and ensuring that proper configuration is getting
deployed is a complex task.
We need to ensure we are driving automation to manage
the configuration complexity, right?
Because without that, it's gonna be really complex for us to start
configuring all these things in multiple different places, and we basically lose
track of what we are monitoring.
These all agonist.
And that might lead to some gaps, and that's where attackers could
start attacking some of the things that you
are running in the cloud. Then, when you also look into
the compliance requirements, right?
The fines are rising.
The reason is, when you run some of these things in the cloud or
on-prem, and you are in a specific sector, you should be meeting
these compliance requirements.
For example, you need to understand how data should be segregated,
what PCI DSS is, and how many days you can retain
some of these records.
With all of these things, if you're not doing compliance in a proper way
and it is caught in an audit, or
it's discovered as part of a breach,
that would lead to huge fines.
And as you can see, the fines are going up and up.
The reason is the complexity and interpretation of the requirements and
how those things are done. Now let's look into the foundation pillars, right?
Inside cloud security posture management, we have basically four pillars.
The first pillar is IaC (infrastructure as code) scanning,
where we proactively scan for any kind of security
misconfigurations within the pipeline.
The second pillar
is cloud native validation.
This is a security validation layer which is closely tied
to the cloud service provider.
And this is the area of focus where we are gonna be doing a deep dive today,
understanding how to configure it
and what some of the advantages are.
And then the third pillar is runtime remediation.
Runtime remediation basically consists of watching events in real time and triggering
functions to correct a misconfiguration which
pillars one and two failed to detect.
And next, stitching pillars one, two, and three together, is compliance
reporting.
Showing compliance and monitoring it is really key to understanding the attack
surface as well as regulatory compliance, because understanding what
is out there and how the resources are running is really key to
coming up with your defense strategy.
Moving on to the next slide, about infrastructure as code security scanning.
What you can do at this layer is enable
pre-deployment scanning, so that whenever your IaC code goes through the
pipeline, developers get
early guidance, and you have a policy integration there within the pipeline
so that all these misconfigurations are caught and nothing goes
out to the cloud in a misconfigured state.
A lot of times these tools come with integrations with version control systems, where
any time you submit a pull request, it is automatically scanned against a specific
set of policies to ensure it meets compliance, and then it moves on to the
next stage, merging into main.
We did a deep dive into this topic, so you can use the link which
is out there or scan the QR code.
What we did there is show how to use an OPA agent for
authoring and enforcing IaC policies,
so that you can stop misconfigurations before
they enter the cloud.
So please watch the video.
That should really help you understand how to prevent misconfigurations
before they go to the cloud.
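
The pre-deployment check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the OPA policy from the referenced video: the resource layout, the field names (`min_tls_version`, `public_network_access`), and the function names are assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical, simplified view of resources parsed from an IaC plan.
PLANNED_RESOURCES: List[Dict] = [
    {"name": "logs", "type": "storage_account",
     "min_tls_version": "1.2", "public_network_access": False},
    {"name": "demo", "type": "storage_account",
     "min_tls_version": "1.0", "public_network_access": True},
]


def check_storage_account(resource: Dict) -> List[str]:
    """Return human-readable violations for one storage account."""
    violations = []
    if resource.get("min_tls_version") != "1.2":
        violations.append(f"{resource['name']}: minimum TLS version must be 1.2")
    # A missing setting is treated as public, so the check fails closed.
    if resource.get("public_network_access", True):
        violations.append(f"{resource['name']}: public network access must be disabled")
    return violations


def scan_plan(resources: List[Dict]) -> List[str]:
    """Collect violations for every storage account in the plan."""
    violations: List[str] = []
    for resource in resources:
        if resource.get("type") == "storage_account":
            violations.extend(check_storage_account(resource))
    return violations
```

In a pipeline, a non-empty result from `scan_plan` would typically fail the pull request check and print the violations as feedback for the developer.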
Integration is very important in the DevSecOps model, so that you are
enabling feedback any time there is a misconfiguration.
Developers should get a complete understanding of what the expected
value is, so that you don't have to chase some of these things during
runtime. Remediating these things during runtime has potential issues
with configuration drift. Sometimes what we see is that vulnerabilities
are high and need to be remediated,
we go and remediate them, and the service goes off in a different direction.
So it's really important to fix this with the shift left approach, because
fixing it during runtime, on the shift right side, is really costly.
The cost of remediation is very high, and what really helps here is
fostering collaboration
between your developer community and security teams to understand the pattern.
Enabling the DevSecOps model
within your organization would really help you do that.
So this is the topic of focus today, where we are gonna be focusing on
cloud native security policies. We are gonna be looking at GCP org
policies, Azure policies, and AWS SCPs to understand why we need them and what
centralized policy management is,
because a lot of these policies should be managed centrally
so that you get maximum value.
And how do you automate it, so that any time a new resource goes into
the cloud, it gets scanned and the policy gets enforced?
So today during the demo, we are gonna
be looking into two specific cloud providers, Azure and GCP, right?
We are gonna be creating a misconfigured resource in Azure.
Then we are gonna be authoring a cloud native security policy in Azure and
redeploying the resource to see
what the difference is, right?
How exactly you're able to stop the misconfiguration.
And the same thing in GCP too.
We are gonna be reviewing some of the policies and how you enforce them.
And we are gonna start creating some virtual machines which are
not appropriate to be deployed,
and you can see how those get blocked within GCP.
So if you can see my screen, we are currently inside Azure Cloud.
Within Azure, within a
subscription, we have a resource group.
We are gonna go and create a resource called Azure Storage.
This resource is basically used to store any kind of
data within the cloud, maybe some kind of blob data.
It could be files or anything of that sort.
This is the configuration I was referring to, where you're trying to
create these things within the cloud.
It's gonna ask you for a name, so for this one we can give a demo
name.
The name should be unique. As we can see here,
these are all the configurations.
So what we are gonna do is intentionally misconfigure this
particular storage account, right?
You see I'm selecting a minimum TLS version of 1.0,
which is an insecure version.
At the same time, on the networking side, this is enabled
for access from across the internet, right?
The firewall is basically open, so anyone from the internet can access it
if they have the proper keys. This is what the data protection
tab looks like for soft delete, and this is the encryption tab. I'm not gonna change anything.
I'm just gonna create it. As you can see,
we have submitted it and the deployment is going through.
Let's wait a couple of minutes and see how the deployment happens.
So what we have really done here is
intentionally misconfigure this resource, right?
And as you see now,
Azure basically accepted it.
It said, hey, this configuration
looks good,
now I'm gonna start creating the resource.
And if you go to the resource here, you can see
some of these settings, right?
It's recommending.
It's recommending, right?
You should have TLS 1.2, but it's not enforcing it, right?
And if you also look into some of these things, right?
Networking: it is enabled from all networks.
If you have shared access keys and access is enabled for all networks,
then think about it.
Anyone with access to the key can access it from any place,
rather than access being given from a specific subnet or a specific
location. Same with data protection.
As you see, it comes with the default settings.
And
if you look into the configuration settings, the minimum TLS version is 1.0.
So we know this is a misconfiguration, or at least a
version which is not the greatest from a security perspective.
So we wanna make sure any storage accounts moving forward are
enabled with TLS 1.2, so that at least we maintain that security.
What we have in Azure is something called Azure Policy.
This is basically gonna help you write and author the policies which
are required for ensuring compliance.
Any time you submit anything for deployment, this is where you
can author policies to say what that deployment should look like.
So what we're gonna do here is a deep dive
into policy authoring and how that policy is applied.
Let's search for TLS 1.2.
Azure has a lot of built-in policies, right?
When you look into some of these policies, you should really be
able to understand how they are authored and how you can do it yourself.
You can also create a custom policy definition if you have a
specific customized requirement.
But for this demo, we are gonna
be using the storage account
policy for minimum TLS version.
If you look into it, what the policy is stating is that the
allowed values are Audit, Deny, and Disabled, which basically describe
how the policy should behave. If it's an audit policy, it's just
gonna audit; it's not gonna block anything.
If it's a deny policy, then it's gonna deny the deployment if it's
not meeting these parameters.
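
The difference between the three effects can be modeled with a short Python sketch. This is not the Azure Policy engine or its definition format; the `Decision` type and the `evaluate_min_tls` function are hypothetical names used only to show how Audit, Deny, and Disabled differ for a minimum TLS requirement.

```python
from dataclasses import dataclass
from typing import Tuple

ALLOWED_EFFECTS = ("Audit", "Deny", "Disabled")


@dataclass
class Decision:
    allowed: bool  # whether the deployment may proceed
    flagged: bool  # whether the resource is reported as non-compliant


def _version(tls: str) -> Tuple[int, ...]:
    """Turn '1.2' into (1, 2) so versions compare numerically."""
    return tuple(int(part) for part in tls.split("."))


def evaluate_min_tls(requested_tls: str, effect: str,
                     required_tls: str = "1.2") -> Decision:
    """Model how one policy effect treats a requested minimum TLS version."""
    if effect not in ALLOWED_EFFECTS:
        raise ValueError(f"unknown effect: {effect}")
    if effect == "Disabled":
        # The policy is not evaluated at all.
        return Decision(allowed=True, flagged=False)
    compliant = _version(requested_tls) >= _version(required_tls)
    if effect == "Audit":
        # Audit never blocks; it only reports non-compliance.
        return Decision(allowed=True, flagged=not compliant)
    # Deny blocks the deployment when the requirement is not met.
    return Decision(allowed=compliant, flagged=not compliant)
```

With this model, a TLS 1.0 request is allowed but flagged under Audit, and rejected under Deny, which matches the two behaviors discussed in the talk.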
So I'm gonna apply this policy as a deny policy.
And I'm gonna be setting the scope to the place
where we are currently working.
And I'm gonna be setting the parameter here to
say, hey, if it is not TLS 1.2, then the deployment should be denied.
So I'm gonna set it to deny and add a short message, which
basically says this is part of testing.
You can customize this message however you
want, so that developers can get some understanding
of what needs to be selected.
For this demo I'm just putting this in. And it looks like there is a
scope issue, which should be fixed.
Okay.
If you look into it, right now the policy is created.
If you check here under assignments, this is the policy that we
applied at that specific scope.
Now, since we have applied this policy, we are
gonna go again under this resource group and start creating a storage account.
This time, if I'm selecting the misconfiguration, it
should potentially deny.
So if I go to Advanced, I'm setting it to 1.0 intentionally,
to see what happens if you move forward with the deployment.
So
again, this is going through the deployment phase at the minute.
It's evaluating to ensure, hey, is this deployment meeting the standards?
And then within a couple of minutes we should be able to see the results of
how the deployment is going.
So if you go to the resource, you can see that this particular
storage account was created, but let's quickly check the policy, right?
To see whether that particular policy is active.
Okay, let's.
So the name of the policy was TLS.
Okay.
Let's quickly make sure. Okay, this is an audit policy currently. As you
can see, if you view compliance, there is one non-compliant resource.
This is how the audit functionality works, and this is what it is gonna show
you if a specific storage account is not configured the way it is supposed to be
configured. What we are gonna do now is go back and change the
configuration of this from audit to deny,
so that it actually denies the deployment right away when the misconfiguration
is submitted to the cloud provider while creating the storage account.
So let's create another storage account here.
You can just pick a random name.
I'm just gonna give something like this because it's just for testing,
and I'm gonna select TLS 1.0 now, making sure this goes
in the proper resource group.
Yes.
I'm just gonna leave everything else, and the validation is in
progress. So as you can see,
this was denied by the policy, right?
This is part of SRE demo, the message which we just put
there, and as you can see here,
TLS version 1.2 is not selected, and that is the reason why this particular
configuration, or this particular storage account, is getting
blocked from being deployed.
If I go back and change the configuration to 1.2, which is a
compliant configuration, it's gonna perform the validation again. Right.
And now it's giving me the option to create it.
So this gives you some understanding of how you use
those policies and how you author them.
Now look into compliance packs, right?
When you go back here and look into compliance packs,
if you're coming from a specific
industry and you would like to enable a pack, for example
FedRAMP or ISO or NIST or any such standard,
you can search by the compliance pack here and you can see a
list of all the policies which are applicable to that specific compliance standard.
And you can assign it as an initiative, so that all these
policies are assigned at that specific scope and you can
start monitoring to a specific compliance level based on the
requirement.
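
Monitoring to a compliance level per pack can be pictured as a simple aggregation over policy evaluation results. The sketch below is illustrative only: the record layout, the initiative names, and the sample results are made up for this example and do not come from any cloud provider's API.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical evaluation records: (initiative, policy, resource, compliant).
RESULTS: List[Tuple[str, str, str, bool]] = [
    ("NIST", "min-tls", "storage-a", False),
    ("NIST", "min-tls", "storage-b", True),
    ("NIST", "no-public-access", "storage-a", True),
    ("ISO", "min-tls", "storage-a", False),
]


def compliance_by_initiative(results: List[Tuple[str, str, str, bool]]) -> Dict[str, float]:
    """Return the percentage of compliant evaluations per initiative."""
    counts: Dict[str, List[int]] = defaultdict(lambda: [0, 0])  # [compliant, total]
    for initiative, _policy, _resource, compliant in results:
        counts[initiative][1] += 1
        if compliant:
            counts[initiative][0] += 1
    return {name: 100.0 * ok / total for name, (ok, total) in counts.items()}
```

A report built this way makes it easy to see which compliance standard has the most failing evaluations at a given scope.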
So that is what we did inside Azure.
If we quickly go back to the notes here, what we did is
check the storage account to ensure it's meeting a
specific requirement, and ensure that without TLS 1.2
the storage account would not be deployed.
Now we are gonna do a similar activity inside
another cloud, GCP, where you're gonna get some understanding
of how you do it and where exactly you need to do it inside IAM.
And you'll get an understanding of what those are.
So if you see my screen here,
let's log into GCP, right?
What we have here is a project.
Anything that you deploy in GCP is part of a project.
When you go to IAM and click on Organization Policies,
this is exactly where you come and author those policies.
These are constraints, where you can start creating constraints
and enforcing them to ensure that a misconfiguration would not go through.
For example, here I'm gonna come to storage.
These are some of the constraints that you can enforce for when the
storage account is not meeting any of these parameters. For example,
take public access prevention, right?
You manage the policy
and edit it here so that it is enforced.
Then any time a storage account is created, if it is open to the
public, it is gonna be blocked.
Let's see some of the already enforced policies here.
For example, there is something called custom machine type, which
basically means that if you're trying to deploy any kind of virtual machine
and that machine is not meeting a specific SKU level,
then it is gonna be denied.
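
The two kinds of constraint mentioned here, an on/off rule such as public access prevention and an allow-list rule such as permitted machine types, can be sketched as follows. This models the idea only and is not the GCP Organization Policy API; the function names and the allow-list contents are assumptions for this example.

```python
from typing import Iterable, Tuple

# Illustrative allow-list; a real organization would choose its own values.
ALLOWED_MACHINE_TYPES = frozenset({"e2-micro", "e2-small", "e2-medium"})


def check_public_access(is_public: bool, prevention_enforced: bool) -> Tuple[bool, str]:
    """Boolean-style constraint: block public storage when enforcement is on."""
    if prevention_enforced and is_public:
        return False, "public access prevention is enforced"
    return True, "ok"


def check_machine_type(machine_type: str,
                       allowed: Iterable[str] = ALLOWED_MACHINE_TYPES) -> Tuple[bool, str]:
    """List-style constraint: only machine types on the allow-list pass."""
    if machine_type in allowed:
        return True, "ok"
    return False, f"machine type '{machine_type}' is not in the allowed list"
```

A request for a very large machine type would fail `check_machine_type` with a message explaining why, while a small default type on the list would pass.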
So let's test that scenario out here.
I'm gonna create a virtual machine.
What I'm gonna do is come and say, I'm
gonna create a very big machine.
And let's see if this gets created.
Okay.
As you can see here, there is a compliance violation.
It basically says, hey, this particular machine cannot be provisioned because it's
not meeting the compliance requirement,
where the SKU size or machine type should be a specific SKU
or a specific type.
So what we'll do is go back and
try again to create a new instance.
Let me pick something.
Okay, now let's do the same thing.
Let's create the instance.
I've left everything as default and haven't changed anything.
This is the minimum SKU that is allowed. As you can see
now, this is going through with the deployment without any issues.
So this is how you would enforce those cloud native policies within GCP,
within the organization policies here, right?
So coming back to the presentation.
We have reviewed two use cases, one in Azure and one in GCP, to understand
how these things get deployed and how you could use
GCP org policies and Azure policies to block a deployment
which is not meeting your compliance or security requirements.
So you understood what cloud native security is
and how you enforce it.
Within AWS, we have a concept called AWS Service Control Policies.
That's how you configure those policies within
AWS, as part of AWS Organizations.
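
AWS was not part of the demo, but for comparison, a service control policy is a JSON document with the same general shape as an IAM policy. The Python sketch below builds one that denies launching EC2 instances of any type other than an approved one, mirroring the machine type restriction shown in GCP. The helper name, the `Sid`, and the chosen instance type are assumptions for this example, and any such policy should be checked against the AWS documentation before use.

```python
import json
from typing import Dict


def build_instance_type_scp(allowed_type: str) -> Dict:
    """Build an SCP-style document denying EC2 launches of other instance types."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "RequireApprovedInstanceType",  # illustrative identifier
                "Effect": "Deny",
                "Action": "ec2:RunInstances",
                "Resource": "arn:aws:ec2:*:*:instance/*",
                "Condition": {
                    "StringNotEquals": {"ec2:InstanceType": allowed_type}
                },
            }
        ],
    }


# "t3.micro" is only an example value for the approved type.
scp_document = json.dumps(build_instance_type_scp("t3.micro"), indent=2)
```

Because the statement uses a Deny effect with a `StringNotEquals` condition, any launch request whose instance type differs from the approved value is rejected in the accounts the policy is attached to.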
Moving on to the next one, event-driven security automation.
Think about this as the third pillar that we talked about, where if there is any
kind of misconfiguration detected,
it is analyzed immediately based on what is in the activity logs,
and then an automated response, in the form of a function, is triggered so that
automatic correction happens, and then a validation happens to ensure that
particular misconfiguration was corrected,
so that it's not a misconfiguration anymore, right?
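
The detect, correct, and validate loop described above can be sketched as a small event handler. This is a minimal sketch under stated assumptions: the event fields (`operation`, `resource_id`), the in-memory `inventory` dictionary standing in for the cloud provider's API, and the function names are all hypothetical.

```python
from typing import Dict

REQUIRED_TLS = "1.2"


def is_misconfigured(resource: Dict) -> bool:
    """Detect: the resource does not use the required minimum TLS version."""
    return resource.get("min_tls_version") != REQUIRED_TLS


def remediate(resource: Dict) -> None:
    """Correct: set the required value (a real function would call the cloud API)."""
    resource["min_tls_version"] = REQUIRED_TLS


def handle_event(event: Dict, inventory: Dict[str, Dict]) -> str:
    """Process one activity-log event and return what happened."""
    if event.get("operation") != "storage_account_write":
        return "ignored"
    resource = inventory.get(event.get("resource_id"))
    if resource is None:
        return "ignored"
    if not is_misconfigured(resource):
        return "compliant"
    remediate(resource)
    # Validate: re-check after the fix so a failed correction is not hidden.
    return "remediated" if not is_misconfigured(resource) else "failed"
```

Running the handler twice on the same event shows the idea: the first call reports "remediated" and the second reports "compliant", because the validation step confirms the fix held.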
What we need to understand here is
what the MTTR is, right?
For example, if you detected a misconfiguration, how
soon do you want to correct it?
Can you wait for a day, or is it something
you want to do within 10 to 15 minutes?
This is where we need to balance cost and security, because
some of these tools come with a
cost, especially when you're reading events in real time
and deploying functions to correct them.
As the company scales, events would grow and the automation
should also grow, which leads to cost.
So balancing that MTTR is really key in enabling the event-driven
automation solution.
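
MTTR here is the average time between detecting a misconfiguration and correcting it. A minimal Python sketch of that calculation, with hypothetical function names and a caller-supplied target, could look like this:

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Incident = Tuple[datetime, datetime]  # (detected_at, remediated_at)


def mean_time_to_remediate(incidents: List[Incident]) -> timedelta:
    """Average time between detection and remediation."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum((fixed - detected for detected, fixed in incidents), timedelta())
    return total / len(incidents)


def meets_target(incidents: List[Incident], target: timedelta) -> bool:
    """Check the measured MTTR against a target such as 15 minutes or one day."""
    return mean_time_to_remediate(incidents) <= target
```

Tracking this number against the chosen target is one way to decide whether real-time remediation is worth its cost or whether a slower, cheaper schedule is acceptable.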
As we have seen how we monitor everything, from a
compliance monitoring perspective,
real-time monitoring is really important.
At the same time, mapping all those policies to a specific compliance
framework is really important, because that gives
you an understanding of what the risk
of staying noncompliant is.
It's gonna give you an understanding of, hey, what is the threat out
there if I'm not doing some X, Y, and Z configuration, and whether that
threat could be exploited or not.
That's how you need to prioritize it, based on the risk.
And anomaly detection is a key part of it, because even if you are doing
everything right, there might be an insider or some such pattern.
Detecting activity which deviates from normal activity is
really helpful in establishing the benchmark, and would help
you understand what your risk is.
Some of these solutions are now coming with AI-driven security
insights, which are basically
predictive threat intelligence, where based on your workloads,
if something deviates, it's gonna notify you.
Or it notifies you if it sees a pattern of suspicious behavior by someone
acting intentionally. Or, for example, some of
these identities are machine identities,
and it identifies some kind of privilege
creep in the permissions: what is
required versus what the identity is currently doing.
All of those are really key in understanding and creating the
baseline, so that when you monitor those things it should be able to
detect any of these misconfigurations and you get a better
understanding of the behavior.
Some of these tools come with AI insights, which
see those patterns
based on behavioral analytics, and with automated vulnerability prioritization.
For example, if you would like to prioritize which vulnerability
gets patched first, having these insights would really help you: hey, is this
a public-facing application or an internal application, and what is its exposure to exploitation,
whatever that is.
At the same time, there are some scoring models out there, which
really give you some insight into
what these scores are and, with these scores, which one
you can prioritize fixing first.
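
One way to picture this kind of prioritization is to combine a base score from a scoring model with exposure context. The sketch below is illustrative only: the `Finding` fields, the bonus weights, and the function names are arbitrary assumptions, not a published scoring standard or any tool's actual algorithm.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Finding:
    name: str
    base_score: float        # 0.0 to 10.0, e.g. from a scoring model
    internet_facing: bool    # public-facing versus internal application
    exploit_available: bool  # whether exploitation is known to be possible


def priority(finding: Finding) -> float:
    """Raise the base score for exposure context (weights are arbitrary)."""
    score = finding.base_score
    if finding.internet_facing:
        score += 2.0
    if finding.exploit_available:
        score += 1.0
    return score


def prioritize(findings: List[Finding]) -> List[Finding]:
    """Return findings ordered from most to least urgent."""
    return sorted(findings, key=priority, reverse=True)
```

With weights like these, a public-facing application with a known exploit can outrank an internal one that has a higher base score, which is the kind of context the talk describes.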
Understanding the roadmap is really important for implementation, because when
you pick a CSPM tool of your choice, you need to understand where exactly
you're trying to deploy these things.
Is it just one cloud, multi-cloud, or hybrid cloud?
Whatever the approach, you need to ensure the tool is properly
configured and properly supported to begin with, in the initial phase,
and then that you have proper policies
defined. Then you start implementing those
CSPM tools.
Some of these tools come with integrations
from the IaC side to the CSPM side.
For the IaC side, you need to ensure it's properly integrated
into CI/CD pipelines, where it supports all the orchestration tools which are
currently running there, so that it detects those patterns early on and blocks
them. For the runtime side of security,
some of these tools also have capabilities to
integrate into it. And pillar two is what we have seen in those demos.
That's where some of these tools don't have coverage, like
direct integration, and that's something that you'll have to do yourself.
Continuous optimization is really crucial.
For example, policy naming and policy enforcement
should be consistent across all these different tools, starting from IaC to
runtime and to monitoring, to ensure there is proper integration in place.
Exception management, again, is key.
This is where we understand, hey, we have security requirements, but for some
reason, and this is a business decision,
we are unable to meet those requirements for X, Y, and Z reasons.
A business owner has accepted that
risk, so there is a temporary exception in place while a
permanent solution is identified. Or
maybe the security benefit
is really not much, even if you have those controls enabled.
That's how we can do security exceptions.
This all should be driven by a risk-based approach, by following risk
assessment strategies and understanding the scope. What we
have seen is some of these exceptions get started with a scope of A, and then there is
scope creep, which expands it to A, B, C.
So we need to make sure all of these are
monitored exactly the way they're supposed to be monitored.
And all the approvals should be documented for any kind of
auditing or compliance requirements.
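
The properties described here (a defined scope, a documented approval, and a temporary lifetime) can be captured in a small data structure. This is a minimal sketch with hypothetical names, not the format used by any particular exception management platform.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet


@dataclass(frozen=True)
class PolicyException:
    policy: str            # the control being waived
    scope: FrozenSet[str]  # exact resource identifiers covered
    approved_by: str       # business owner who accepted the risk
    expires_on: date       # exceptions are temporary by design


def is_exempt(exception: PolicyException, policy: str,
              resource_id: str, today: date) -> bool:
    """An exception applies only if approved, in scope, and not expired."""
    return (
        bool(exception.approved_by)
        and exception.policy == policy
        and resource_id in exception.scope
        and today <= exception.expires_on
    )
```

Keeping the scope as an explicit, immutable set is one way to guard against scope creep: covering an additional resource requires a new, separately approved record rather than a silent edit.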
The key takeaways from this presentation: we have seen the complete journey from
shift left, which we demoed first.
The video for that is part of the link there,
so please see that.
And what we have now seen today is
leveraging cloud native security
controls, how those controls are built, and how you embrace automation to
deploy some of those controls,
and why you need specific AI-driven insights to
prioritize some of these things.
In the future, at some point, we're gonna be doing a deep dive
into exception management, because this is a growing area of interest inside
cybersecurity where we see scope creep, and also into some of the platforms
that are coming up that would really help in automating some of these tasks so that
you can deploy those exceptions at scale.
So we will be doing a future deep dive
around this topic at some point later.
That concludes my topic here.
Thank you very much for your time, and if you would like to connect
or chat, please scan the QR code. I look forward
to connecting with you.
Thank you all for your time today.