Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hey everyone.
Welcome to the Conf42 Site Reliability Engineering conference.
My name is Peter De Tender and I'm gonna walk you through Azure Load Testing.
Now, before diving into the actual topic, let me share a bit about myself.
So, Peter De Tender, originally from Belgium, but moved to the Redmond area about
three years ago, where I'm working as a Microsoft Technical Trainer.
I think it's still one of the easiest job titles within Microsoft, because
what I'm doing is providing technical training on our Microsoft Azure platform,
primarily to our top customers across the globe using virtual online workshops.
My career started about 30 years ago as a physical technical engineer,
literally building physical data centers, the cabling, the cooling.
And then I gradually moved up the solution stack, you could say, all the way
from that physical data center to networking, storage, and servers, and then
up to cloud, where around 2013 I picked up Azure for the full hundred percent.
From there I moved a bit from cloud engineering to cloud architecting, a lot
of DevOps automation, and gradually into AI over the last three years.
You got all my contact details here, and feel free, I would say, to reach
out if you have any questions while watching the session, or
obviously if you're watching this later on as a recording, then more than
happy to provide you with an answer.
Now, the starting point is: what is SRE right now, knowing that the core
topic of the conference today is all about site reliability engineering.
Now, where does my Azure load testing fit?
So to me, SRE, if you think about what you're doing, and again, don't
feel personally attacked about this, it might be totally different in your
case, but to me, site reliability engineering means you're gonna spend a
huge amount of time on stress testing.
Why stress testing?
Because your goal as an SRE is making sure that your environment, your workload
(within Azure, outside of Azure, AWS, Google Cloud, on-prem, it's not important
where the workload is running) is as highly available and redundant as possible.
The way to do that is typically hammering your environment using load testing.
So this seemed to me the perfect example to talk about: load testing in Azure
as part of the SRE conference.
So what is load testing, first of all?
This is a sort of formal definition, I think, that I found online.
Load testing is a type of performance testing that determines the performance
of a system, a piece of software, an application, or a full workload if you want,
under (and that's important) real-life, real-workload-based load conditions,
which typically would be the number of users, the number of sessions, maybe
the amount of traffic.
If you have a lot of streaming data, like a Netflix maybe, or something similar
with streaming media, all of a sudden the traffic could become more important
than the actual number of users.
From here, we move into Azure load testing.
So obviously the assumption here is that you're gonna run your workload on top
of the Azure platform, where now we have a managed service in the platform
called Azure Load Testing.
Yes, every now and then Azure service names actually do make sense.
Azure Load Testing is a managed service that can be used to, again, simulate load.
You could run this against your production environment, but even more so, I
would say, you're gonna run this in your test, acceptance, or staging environment,
because you don't want to impact your production.
From there, you're gonna target an application user interface, like an
App Service, a web application, or a virtual machine.
And why not API endpoints?
As long as you can target your backend using HTTP(S), you're actually good to go.
Now, there's one part to testing, like going into the portal.
No surprise, I'm gonna walk you through some live demos there where we have
an application up and running.
You're gonna integrate Azure Load Testing, you're gonna use the Azure portal
for now, you're gonna click around, and then you're gonna build your test.
But that's like a one-time shot, or a manual shot.
As an SRE, I guess we can all agree that a big part of your job is also
automating your tasks.
So that's where the second bullet point becomes quite important.
You're gonna build your tests manually at first and then gradually
automate the whole process.
And the good news is you can trigger Azure Load Testing
as part of your CI/CD pipelines.
A lot of flexibility where you're gonna define test criteria, and I'll show
you again in a demo how that works.
And then it's also crucial to mention here that initially, two years ago,
load testing in Azure was based on the Apache open source JMeter framework.
In the meantime, it also supports Locust, and you're gonna hear about some
differences between the two scenarios.
But the cool thing is the Azure environment is managed for you; you're gonna
bring in one of the two options to really configure your testing.
Next to that, what do we need as part of our load testing definitions?
I call this here, the load testing components.
You need a load testing resource, literally a top-level resource
providing you a centralized place to view your testing.
What it means is: go to Azure Load Testing as the service, and that becomes
your dashboard for all testing scenarios.
Next to that, obviously you need a test and a test run.
A test is the technical definition; the run means we're actually gonna
execute the test.
The test engine is the compute infrastructure in the backend; these are the
instances that will actually run your test script.
And then last, you need an application component.
Again, any HTTP(S)-reachable backend from within Azure would be
a possible scenario.
Next, we have the load test itself.
So again, URL-based is one option, the easiest one, which I'll show you as the
starting point of my demo: give me your URL and I'm gonna simulate testing
against it, obviously not anything that's running outside of your control.
The second option, to make it a little bit more standardized or maybe even a
bit more advanced if you want, is using an Apache JMeter testing script.
You're gonna build your test typically offline, not really using the Azure
portal anymore.
Or why not: maybe you have a mixed environment across different clouds, but
also on-prem, and now you're gonna rely on that JMeter framework.
Now, the JMeter framework has some specific characteristics
to it, like the JMX files.
That's an XML-based format where you're gonna provide all the elements like
the controllers, the timers, the assertions, and the listeners that you wanna
use as part of your testing.
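Just to give you an idea of the shape of such a JMX file before we get to the demo, a heavily trimmed sketch could look roughly like this (JMeter itself generates quite a bit more boilerplate around these elements; the values here are illustrative, not my demo settings):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<jmeterTestPlan version="1.2" properties="5.0">
  <hashTree>
    <TestPlan guiclass="TestPlanGui" testclass="TestPlan" testname="Live URL Test"/>
    <hashTree>
      <!-- Thread group: how many virtual users and how fast they ramp up -->
      <ThreadGroup guiclass="ThreadGroupGui" testclass="ThreadGroup" testname="Virtual Users">
        <stringProp name="ThreadGroup.num_threads">250</stringProp>
        <stringProp name="ThreadGroup.ramp_time">60</stringProp>
      </ThreadGroup>
      <hashTree>
        <!-- HTTP sampler: the actual GET request against the homepage -->
        <HTTPSamplerProxy guiclass="HttpTestSampleGui" testclass="HTTPSamplerProxy" testname="homepage">
          <stringProp name="HTTPSampler.protocol">https</stringProp>
          <stringProp name="HTTPSampler.domain">example.com</stringProp>
          <stringProp name="HTTPSampler.path">/</stringProp>
          <stringProp name="HTTPSampler.method">GET</stringProp>
        </HTTPSamplerProxy>
        <hashTree/>
      </hashTree>
    </hashTree>
  </hashTree>
</jmeterTestPlan>
```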
Now the second option, or the third actually, is using Locust, which I would
say is a popular testing framework as well.
The big difference between JMeter and Locust is obviously that it's not an
Apache framework, but next to that it also relies on Python and coding
instead of the JMX files.
So same concept, slightly different framework.
That's the summary of it.
And then, what I also mentioned during the introduction: now we have our
tests and we're gonna automate them.
The good news is that at least from the Microsoft DevOps solutions, which
means Azure DevOps or GitHub Actions, you can integrate your Azure Load Testing
scenarios using an already pre-built marketplace component.
You build your pipelines in the exact same way as you already have CI or CD,
so why not combine them?
In Azure DevOps, you're gonna bring in a pre-built marketplace
component called Azure Load Test.
If you're no longer using Azure DevOps, but it's all happening on the GitHub
side, that's where you're gonna bring in GitHub Actions.
And again, no surprise, you have a GitHub Actions marketplace component
called azure/load-testing, and for now it's still version one.
And then you go, okay, Peter, we're not using Azure DevOps,
we're not using GitHub Actions.
What are the options?
The good news is, since Azure Load Testing is just another service in the
Azure platform, it means you can target it, you can trigger it, using
traditional REST API calls.
I might not have enough time in this session to show you all three options.
But at least I'm gonna shift to the next part, and that's really
showing you how all this works.
I'm gonna start with deploying load testing.
I'm gonna show you one of my sample backends, just a web app, and then from
there, configuring and triggering a manual test, showing you the JMeter option
as well as the Locust framework to give you some touchpoints.
And then again, if time allows, I'm gonna show you how you could integrate it
as part of your DevOps solutions.
See you back in a bit.
So here I am in my Azure portal, where I'm gonna use Azure Load Testing.
Now again, I call this a managed service, which means you don't
have to install anything.
It's already available within your Azure environment.
You could obviously create a new load testing resource from here.
But to speed up the demo a little bit, I'm going to use the one that I already have.
If you're a bit into Azure, which I assume you are, it's just like any
other Azure resource: you define the location, you provide a name,
a resource group, and off you go.
So that's mainly what I did here.
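If you'd rather script that step than click through the portal, a minimal sketch with the Azure CLI would look something like this (assuming the load CLI extension is available; the resource names are just placeholders, not my demo names):

```bash
# Add the Azure Load Testing CLI extension (one-time setup)
az extension add --name load

# Create the top-level load testing resource, like any other Azure resource
az load create \
  --name sre-demo-loadtest \
  --resource-group rg-loadtest-demo \
  --location eastus
```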
Once we have our load testing service configured, that's where we're
gonna use some of its capabilities.
You can start with a couple of different scenarios.
The first one I'll use is over here, where I'm gonna add HTTP requests.
So I'm gonna select Create a new test, and this is now what
we call the URL-based test.
I can provide a name; I'm just gonna accept the defaults, or let's call this
one the Live URL Test, with a description of Live URL Test.
How hard can it be?
Run the test right after creation, and if you want, you can see here debug
mode, which is gonna give you access to debug logging.
If a test is running but it's failing, it's not doing what it needs to do,
then I would say please enable this.
Enable advanced settings is just gonna give you access to more options down here.
Let's move on to the next step.
I'm gonna add a new request, and again, later on, if this is no longer the
first test you're creating or you know how the Apache framework is working,
then you could also start from an input data file.
I'm gonna create a new request, I'm gonna accept all the defaults because
it's not all that important for now, and I'm gonna use input in the UI.
What this means is: my request name, so this would be, let's say,
homepage HTTP GET, and my URL is my website.
I'm gonna use my production blog website, 007ffflearning.com,
and then what is the HTTP method.
Now, from here, I'm gonna take a minute to highlight a few things.
Right here I'm connecting to the homepage.
Now, let's say your homepage is your product retail catalog, or in my case
my blog website.
It might all be fine, but what if I wanna go to a lower level?
Then from here you could actually add the subpath.
Like in my case, I have a section on my blog that talks about the books that
I wrote in the past, and I wanna run a test against my books subsection and
not the homepage.
From here I could move on; the default would be a GET command, but as you
can see, we're respecting all REST API methods.
Imagine you have, again, a product retail site: you wanna test the performance
impact when you're uploading products, when you're ordering products, when
you trigger a payment, when you're making changes to the site, and so on.
That's what you can define here, but I'm not gonna make it that complex in
my demo scenario.
Again, you can fine-tune query parameters, like injecting specific variables,
you can tune the headers, and maybe even look into response variables.
Again, just showing you the options, what all the capabilities are; we can
make this super straightforward, super easy, or rather complex.
Nothing blocks you from combining multiple requests.
So I could do a second URL request, where now I could go back to the same
endpoint, or why not go to a different endpoint.
So this could be my sample Azure web app, on azurewebsites.net, because what
if my website homepage has dependencies on other websites?
Again, using some other URL or REST API call here, and then building up
a flow of tests on the request side.
It's gonna run the first request, the second, the third,
however many you want to add into it.
That's up to you.
Clicking Next onto the next cycle, where we can define different parameters:
environment variables (hopefully pretty obvious; based on a variable this
could be dev, test, staging, or production) and secrets.
Imagine that you wanna connect, like in the retail example I mentioned, maybe
you wanna connect to a database, you wanna interact with Key Vault, you wanna
interact with connection strings.
You can define it here, and obviously the same for certificates, if you
interact with Key Vault, our Azure secret store.
The idea is obviously that your testing framework, your running test, now
also needs permissions to interact with Key Vault, and that's what's down here.
Preferably, following Microsoft best practice, we're expecting you to define
a managed identity and define role-based access for this specific test,
getting the necessary permissions to hook into Key Vault.
At the same time, it's not only locking down permissions, but it's also
defining them in a granular, zero trust approach, you could say.
Moving on to the next step, that's where we're gonna define the actual
load simulation: the number of engine instances, like how many simulations
but also how much compute, and that's where the scaling details here would
be beneficial.
Each and every engine allows you to run up to 250 user sessions.
So if I wanna simulate a thousand users, it means you need four instances.
I'm quite limited; although technically I could move this up, my subscription
doesn't allow me to do this, physically or technically, so I'm gonna stay
with two.
I'm gonna move up my number to 350, and now it tells me, wait a minute,
it can only be 250.
So that's a bit confusing: 250 here is per instance, and with two instances
I'm still gonna simulate 500 users.
So just trying to clarify the portal a little bit.
I want linear growth, a step-by-step growth, and you can see that the
parameters are changing.
And you could also simulate a spike, where we start with 250 concurrent users
and then gradually spike, simulating literally a CPU spike or a user spike,
like we've got more users coming in because we offer some online food ordering
platform, and between six and seven we see a spike for breakfast, between
11:30 and one we see a spike for lunch orders coming in.
That's the scenario you could simulate here.
I'm gonna keep the default of linear, and that should be good enough.
Do you want to integrate monitoring?
What this means here is that we provide you a monitoring dashboard, but now
you could actually hook into other Azure scenarios.
You could use Application Insights, storing data in there; you could
configure some other options, and you can see it from here.
Moving on, where now we can define even more specific test criteria.
You can define conditions, like: for this specific metric, like response time,
there's an aggregation, and if the condition is true, do this or do that.
So again, showing you more options, more granular configuration parameters.
The baseline would technically be the same thing.
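By the way, those criteria don't have to live only in the portal: they can also go into the load test configuration YAML that the CI/CD integration uses later on. A minimal sketch could look roughly like this (the file name, test name, and thresholds are just illustrative placeholders, not my demo values):

```yaml
version: v0.1
testId: live-url-test
displayName: Live URL Test
testPlan: url-test.jmx            # the JMeter script exported from the portal
engineInstances: 2                # 2 engines x 250 virtual users = 500 users
failureCriteria:
  - avg(response_time_ms) > 500   # fail the run if the average response time exceeds 500 ms
  - percentage(error) > 5         # fail the run if more than 5% of requests error out
```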
All that is summarized in the last overview, and we're gonna create our test now.
Although I started from the HTTP URL-based option, you can see that, by design,
it's using a JMeter script.
What it's technically doing here is transforming all my settings from the
portal into that JMeter framework language.
It's gonna take a couple of seconds to actually create the test, and once
it's done, we're gonna move over and we have our test ready.
So to speed up my demo again a little bit, I already have one from two days
ago, as you can see here.
I got my test, I got all my scheduling done.
I'm not scheduling anything, but that could be another SRE trick here:
build a test, run a test, but then also schedule the test, run this every
hour, run this overnight, and whatnot.
And then last here, we're gonna run our test itself.
When we run the test (this one is already done, because it typically takes
five minutes, twenty minutes, depending on the timeline of the testing you
wanna go through; this was a pretty short one, only two minutes, because I
didn't want to wait for it just to show you some results), what it gives me
here is that linear growth.
So this one was a bit smaller: move up to 50 users, start small, gradually
move up, and then hammer my platform.
I used HTTP GET, and then it's gonna show me the outcomes of it: response
time in this case, the number of requests.
And then obviously the go-to scenario would be to add an additional level of
tests, changing the parameters, maybe running the test again with a different
App Service backend and whatnot.
And then it also shows me some server-side metrics, but those have
not been configured.
What I would need to do here is what I showed you before: enabling
Application Insights.
Application Insights is where you're gonna hook into Azure Log Analytics,
allowing you to store all the metric data, not only visible in the portal
but actually storing the telemetry information and allowing you to, again,
do data mining and data viewing out of that.
And that's where the server-side metrics come in, because they rely on that
other framework.
Next to that, I can download my input file, the logs, the results.
I'm gonna show you the input file.
Why?
Because this is what I can use to show you that JMeter framework.
Notepad is not the best tool, but I'm gonna use it for now, just opening an
easy Notepad document, and then I could normally... not drag and drop,
obviously not.
What am I doing, Notepad?
No, Peter, you don't need Notepad, you need Visual Studio Code to do this.
And then I guess I can drop it; it is in my Downloads.
It looks a bit weird, but eventually it shows up as we expect.
So this is the JMeter config file.
It has the display name, what kind of test plan, and as you can see here,
there's a pointer to the actual JSON file.
What you see here looks a little bit awkward because for some reason it's
not respecting the JSON layout, but you can see that it has that syntax.
And then the last one is the actual JMX file.
Oh, there we go, we could actually do Notepad, but that's fine.
And this is that XML structure that I talked about before.
So it's gonna show you all the details.
Let's see if I can find some of the settings from my configuration here.
So it's HTTP, that's what we picked, we had an HTTP GET.
That's fine.
And this is where I would say go into the documentation of Apache JMeter.
The nice thing again is that now since we exported this, I could technically close
this because I don't need it anymore.
I could now go in and actually upload a new test.
So if I take one step back: create new test, upload a script.
It's gonna ask me, okay, where do you have your files?
It's JMeter, it's in my local file system, it's in my repos, it's anywhere
you wanna store it.
And then you don't have to go through the portal anymore.
So, URL test versus JMeter, it's technically almost the same thing; I would
say use the JMeter scenario for the more advanced configuration options.
The other example we have is Locust.
So again, what is Locust?
It's what's down here; I got another little demo that I built.
Everything that I talked you through is exactly the same thing: I'm selecting
a test, I'm running a test.
The main difference here is that it's using a different framework.
The nice thing, again, is that within our Azure Load Testing it's using the
same Azure approach to give you that capability: same testing, same ease of
use, same advanced configuration, more granular options, just using a
different framework.
You might have used JMeter in another cloud or in an on-prem world, or you
might standardize on Locust.
That's the philosophy behind it.
What is the main difference?
I would say it's the configuration.
This is my Locust Python script that I used for this test, and as you can
see here, it's using the Python scripting language to define the details
of your test.
It's connecting to my workload, and it's a GET workload: it's pulling up a
to-do item, and then it's reading out the response to validate whether the
response was 200, retrieving the response, and so on, using Python, a
traditional scripting language.
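Just to make that concrete, a minimal Locust script for that kind of scenario could look roughly like this (not my exact demo file; the /todos path and the status check are illustrative):

```python
from locust import HttpUser, task, between


class TodoUser(HttpUser):
    # Simulated think time between requests for each virtual user
    wait_time = between(1, 3)

    @task
    def get_todo(self):
        # GET a to-do item and validate that the backend answered with 200
        with self.client.get("/todos/1", catch_response=True) as response:
            if response.status_code == 200:
                response.success()
            else:
                response.failure(f"Unexpected status code: {response.status_code}")
```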
Running the test and seeing the output of the test is the exact same
experience: loading about 500 users for four or five minutes and then
reading out the results.
And again, the dashboarding views are technically the same thing.
Next, I talked about using a DevOps mindset.
So the example I have here is connecting to Azure DevOps.
I could show you GitHub Actions as well, but I'm still a lot more into Azure
DevOps, I guess out of my background; I've been playing with DevOps for, I
dunno, 10, 15 years, so Team Foundation Server was my playground.
I got my to-do items, the one that I published in my pipeline.
And then in here, what's relevant (apart from the fact that it didn't run
successfully, but that's because I didn't configure the correct permissions)
is this one single stage.
So this would be part of your CI build pipeline, right, your CI/CD pipeline,
sorry, where you're gonna define the service connection into Azure, what the
name of the load test is, what the resource group is.
And then from there you're gonna use this pre-built Azure Load Test task.
It needs a load testing config file, so the YAML file referencing the JMeter
script as I showed you, the resource group, and then from there, not really
needing anything else.
Super straightforward.
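A single stage like the one I'm describing could look roughly like this in the pipeline YAML (a sketch only; the service connection, resource names, and config file path are placeholders for whatever you have in your own environment):

```yaml
stages:
  - stage: LoadTest
    jobs:
      - job: RunAzureLoadTest
        pool:
          vmImage: ubuntu-latest
        steps:
          # Pre-built marketplace task from the Azure Load Testing extension
          - task: AzureLoadTest@1
            inputs:
              azureSubscription: 'my-azure-service-connection'   # service connection into Azure
              loadTestConfigFile: 'loadtest/config.yaml'         # YAML config referencing the JMX script
              loadTestResource: 'sre-demo-loadtest'              # name of the Azure Load Testing resource
              resourceGroup: 'rg-loadtest-demo'
```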
How can you install this extension?
You go into the Visual Studio Marketplace, I think it's
marketplace.visualstudio.com.
There we go.
And you're gonna search for Azure Load Testing.
It's gonna ask you where you want to install this, as part of what Azure
DevOps organization.
You download it, you install it, and that's where it becomes available.
And if you go, okay, that's all cool, but can you show me how to do this in
GitHub Actions?
Although I don't have it up and running and configured, you would go into
your repo, you would go into Actions, and then here you would search for
load testing.
I think it's one word maybe... load... what happened here?
Load.
Oh, it's not the full action; obviously, it's a component within.
It should be here: Azure Load Testing, and then picking the right version
or the latest version, that's up to you.
And then, as you can see from my slides, Azure Load Testing asks again for
the same configuration data: the JMX file, the YAML config file, which most
probably is already part of your repo anyway, and then you run your pipeline
just like any other component.
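On the GitHub side, the equivalent workflow could look roughly like this (again a sketch; the secret name, resource names, and file path are placeholders, and you'd check the marketplace page for the current version of the action):

```yaml
name: load-test
on: workflow_dispatch

jobs:
  loadTest:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Sign in to Azure so the action can reach the load testing resource
      - uses: azure/login@v2
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}

      # Marketplace action azure/load-testing, still version one
      - uses: azure/load-testing@v1
        with:
          loadTestConfigFile: 'loadtest/config.yaml'   # same YAML config referencing the JMX script
          loadTestResource: 'sre-demo-loadtest'
          resourceGroup: 'rg-loadtest-demo'
```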
The last scenario we have, I have it somewhere on the side, is running it
using traditional REST API calls, and I'm just gonna grab the little sample
code that I have for that.
Just give me a second here.
That's what happens when you do all this live, right?
It would look a little bit like this.
So first you're gonna authenticate into Azure; next, it's gonna create that
authentication token.
And from there, traditional REST API: curl, POST, GET, anything you wanna
do with it.
HTTP, oh, sorry, this is obviously the POST, because you're triggering the
load test itself: point to your load testing endpoint using the token, then
define the header and the actual details, where again it's pointing to your
script, the JMX file; in this case it's a JMeter example.
And then running the actual test is again running a curl POST, and then
you're gonna run the actual test cycle here.
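As a rough sketch of that flow (the data-plane endpoint, test IDs, API version, and request bodies below are placeholders; check the Azure Load Testing REST API reference for the exact shapes before relying on this):

```bash
# 1. Authenticate and grab a bearer token for the Azure Load Testing data plane
#    (the --resource value is the data-plane audience; verify it against the current REST docs)
TOKEN=$(az account get-access-token \
  --resource https://cnt-prod.loadtesting.azure.com \
  --query accessToken -o tsv)

# 2. Create or update a test definition (placeholder endpoint, test ID, and body)
curl -X PATCH \
  "https://<your-dataplane-endpoint>/tests/live-url-test?api-version=<api-version>" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/merge-patch+json" \
  -d '{"displayName": "Live URL Test", "loadTestConfiguration": {"engineInstances": 2}}'

# 3. Trigger a test run that executes the test (again, placeholder endpoint and IDs)
curl -X PATCH \
  "https://<your-dataplane-endpoint>/test-runs/run-001?api-version=<api-version>" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/merge-patch+json" \
  -d '{"testId": "live-url-test"}'
```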
Three different scenarios, three different options, pretty much everything I
wanted to talk about in this 30-minute session.
Looking at the clock, I can see that I'm almost running out of time, so I'm
gonna switch back to the presentation and close with the last couple of words.
Awesome, so that worked pretty fine.
Now we're close to the end of my session, but allow me to close with a couple
of last words.
One of the last things I wanna highlight here is that Azure Load Testing is
only one of, I would say, our Azure mission-critical components.
The second option, as you can see here, is Azure Chaos Studio, allowing you
to do chaos engineering against your Azure-running workloads.
If you wanna find more details about how this technically works, what the
Microsoft guidelines are, the best practices on building out this sort of
mission-critical workload scenario, I would say go to this GitHub link, where
you find all the documentation, including samples, including sample testing
and chaos experiments.
With that, I'm at the end of my session time for now.
So thank you for watching.
Thank you for being here again.
Enjoy the rest of the SRE conference here today.
If you've got any questions, don't hesitate to reach out by email, on Twitter,
or maybe on LinkedIn if you wanna stay in touch, and enjoy the rest of the
conference.
Thank you for now, and see you again at some other Conf42 conference in the
near future.
Enjoy the rest of your day.