Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello and welcome to this presentation, Why and How to Build Your Own Web
Server in C#. I'm Luka, and I will present the latest project I
worked on, how I did it and why I did it.
Before I start talking about the project, a short introduction to my
developer journey, which will explain why I started to be interested in building my
own web server.
I think my journey will feel familiar to many of you, because I first
started by exploring IT very broadly.
I learned what a computer is, how servers work, how the internet functions, what
a website is, all this kind of stuff.
Of course, after that, I wanted to create my own website, so I
started to use a CMS to build one.
Then I began to learn to code: HTML, CSS, JavaScript, PHP, and a lot of other languages.
And naturally, the next step for me was to use my coding skills to create my
own website, but this time from scratch, without any CMS.
But once we have our own website that presents ourselves, we need
to think about what else we can do, what else we can build.
At this moment, you have different possibilities,
and I think this is where our paths probably start to go in different directions.
For most of us, myself included, the next step is to build more applications.
So you have the classic tutorial projects, a booking system, and so on.
But it's not the only path.
Some prefer to focus on infrastructure, and so they handle the servers
and services themselves.
But in fact, we could continue to go deeper. Why
stop at building websites?
Why not try to create your own services and your own web server?
The main reason we don't is probably that we are software engineers.
And as software engineers, we are usually focused on building
applications, developing new features, implementing business
logic, but not really on systems.
In my career, I made the transition from software engineer to site reliability
engineer, and it's at that moment that systems became truly important
to me, because my focus shifted toward services: how to monitor them,
how to automate actions on them.
And it's at that moment that it hit me:
why don't I build my own web server?
One that would maybe solve my biggest frustrations with existing web servers.
Before the how, the first thing
is to understand why we want to build a web server,
and not something else.
It's very challenging to build a system, and you have many
things to consider before starting.
First, it's a great opportunity to learn about a fundamental service.
It will help you to better understand how current systems work and why they
are designed this way and not another.
Like any project, of course, it's a great way to showcase your software
engineering skills, especially if you are a backend engineer, where it
can be more difficult than for a frontend engineer who can show websites.
But why a web server?
Why not a database, for example?
The reason is simple: a web server is relatively simple
compared to a database.
But even if it's simple, it has a variety of complex
problems that you need to solve.
So before the how, let's talk about the what.
Even if
I think we all know what a web server is, it's still important to define what
we want to build before jumping into the code.
A web server can be described through a few key functional and non-functional requirements.
First, it's a service
running on a host, listening on a port, usually port 80.
It can manage HTTP requests, and it will generate HTTP responses.
So that is the functional part.
On the non-functional side, a web server is secure, with authentication, and
reliable, which means it will not crash under heavy usage.
It's performant and has good scalability. And, it's not strictly necessary,
but I added it for my web server: it needs to be easy to operate,
especially for administrators or SREs.
The first challenge comes even before you start to build:
it's about the platform.
Most web servers today run on Linux, so if you are building one today,
you need to work on that platform.
But at the same time, I personally use Windows, and I remember that the
first web server I installed, Apache with the WAMP stack,
was on Windows, so I want my web server to also be available for people
who haven't migrated to Linux yet.
So that is the first challenge.
Fortunately, today, with .NET Core, you can write your code in C# and
run it on both Linux and Windows.
It was not the case before, so I'm really happy to be able to do that today;
for C#, it's huge progress.
Of course, there are still some platform-specific considerations, like
paths, metrics, or everything about scripting, but we can
handle these challenges with the right architecture and the proper abstractions.
So now, let's take a look at the code.
Building a server and connecting to it
is in fact pretty simple.
What we call a server is essentially just a TCP listener,
defined by an address and a port.
And that's it.
You can start it, and it will begin to listen for incoming connections.
It's so simple that someone recently tweeted: wait, a web server
is basically an infinite loop?
And while, of course, the real answer is more complex, it's not
entirely wrong. We can define it as an infinite loop, and that's all.
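As a minimal sketch of that "infinite loop" idea, assuming .NET 6 or later: the class name `AcceptLoop` and the `handleConnection` callback are hypothetical names for illustration, not the code shown in the talk.

```csharp
using System;
using System.Net.Sockets;
using System.Threading;
using System.Threading.Tasks;

public static class AcceptLoop
{
    // Accepts connections until the token is cancelled.
    // Each connection is handed to a callback on its own task.
    public static async Task RunAsync(
        TcpListener listener,
        Func<TcpClient, Task> handleConnection,
        CancellationToken token)
    {
        listener.Start();
        try
        {
            // The "infinite loop": wait for a client, hand it off, repeat.
            while (!token.IsCancellationRequested)
            {
                TcpClient client = await listener.AcceptTcpClientAsync(token);
                _ = Task.Run(async () =>
                {
                    // Dispose the client so the connection is always closed.
                    using (client)
                    {
                        await handleConnection(client);
                    }
                });
            }
        }
        catch (OperationCanceledException)
        {
            // Cancellation is the normal way to stop the loop.
        }
        finally
        {
            listener.Stop();
        }
    }
}
```

A caller would create the listener with something like `new TcpListener(IPAddress.Loopback, 8080)` and pass a callback that reads the request and writes a response.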
Step two. Okay, I can connect to my server, which is pretty cool, but
to really be a web server, it needs to do what a web server does.
That means responding with an HTTP response; otherwise it's not a web server,
it could be any other kind of server.
HTTP requests and responses are actually quite simple to understand.
They consist of three parts:
the status line, the headers, and the body.
The status line and each header end with a line break, and an empty line
separates the headers from the body, so they are pretty
easy to parse or even to read.
To test that, we can create a basic response and send
it back to every connection.
And with a simple GET request, here to my localhost on
port 8080, we can see that my server responds with hello world, just as
expected, and the HTTP response is understood by the program that called it.
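A sketch of how such a hello world response could be assembled, to make the three parts concrete. The class name `HelloResponse` and the choice of headers are assumptions for illustration.

```csharp
using System.Text;

public static class HelloResponse
{
    // Builds a complete HTTP/1.1 response: status line, headers, empty line, body.
    public static byte[] Build(string body)
    {
        byte[] bodyBytes = Encoding.UTF8.GetBytes(body);

        string head =
            "HTTP/1.1 200 OK\r\n" +                          // status line
            "Content-Type: text/plain; charset=utf-8\r\n" +  // headers
            $"Content-Length: {bodyBytes.Length}\r\n" +
            "Connection: close\r\n" +
            "\r\n";                                          // empty line before the body

        byte[] headBytes = Encoding.ASCII.GetBytes(head);
        byte[] response = new byte[headBytes.Length + bodyBytes.Length];
        headBytes.CopyTo(response, 0);
        bodyBytes.CopyTo(response, headBytes.Length);
        return response;
    }
}
```

Writing the returned bytes to the connection's stream is enough for a client such as a browser or curl to display the body.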
Of course, we need to add more logic to our service.
That means we need to parse the HTTP request, which, like
the response, is very simple: a request line, headers, and a body, with a blank line before the body.
With this parsing, and especially the HTTP request line, we can start to introduce
some logic to handle different requests instead of always returning hello world.
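Parsing the request line could look like the following sketch. `RequestLine` and `RequestParser` are hypothetical names, and a real parser would also read the headers and the body.

```csharp
using System;

// The three space-separated fields of the first line of an HTTP request.
public sealed record RequestLine(string Method, string Target, string Version);

public static class RequestParser
{
    public static RequestLine ParseRequestLine(string rawRequest)
    {
        // The request line is everything up to the first CRLF.
        int end = rawRequest.IndexOf("\r\n", StringComparison.Ordinal);
        string firstLine = end >= 0 ? rawRequest.Substring(0, end) : rawRequest;

        string[] parts = firstLine.Split(' ');
        if (parts.Length != 3)
        {
            throw new FormatException("Malformed request line: " + firstLine);
        }
        return new RequestLine(parts[0], parts[1], parts[2]);
    }
}
```

With the method and target available, the server can choose a different response per request.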
Challenge number two: handling connections.
How does the server receive connections? We need to manage them properly.
So let's set the following rules,
or requirements, for our connection model. The component that handles connections
should be modular, to make it easy to integrate, remove, or change in the future.
When we receive a connection, we need to decide whether we will accept or reject it.
The handler must also manage the entire life cycle of the connection.
That means we must accept the connection, but we must also
ensure it gets closed properly.
And of course, the request should be served,
and an HTTP response must be sent back, because after all,
we are building a web server.
Our service, which is the main program, will in fact have different modules.
The first one we will call the web module.
This module will include functions like start, run, and stop, and with the run
function, we will handle the connections.
The previous code is now moved into the run function of the module.
And we can see that when a connection arrives, we accept
the socket and we call AcceptConnectionAsync to manage it.
Connections are a challenge first of all because each one consumes resources on your server.
This means we need to set a limit on the number of connections
we can accept at the same time.
So we'll implement two limits.
The first one is the number of connections we will serve.
That means the connection arrives,
we accept it, and we do, based on the path in the request,
what the client asks of our server.
When this limit is reached, meaning we are starting to serve too
many clients at the same time, we will start to reject connections.
But there is still a limit here too, a second threshold for the connections that
we can't serve but can still answer with a proper HTTP response.
And finally, if we are really overwhelmed, we will simply close any new
connection without any response.
It's a little brutal for the client, but it's needed for the server, because
it really cannot take any more traffic.
We will start with the connections that we reject nicely, because it's the
shortest path. When we receive a connection that we cannot serve, we
return an HTTP response, and in this case, we use the status code 500.
This way, we handle the connection quickly, using minimal
resources, but the client still knows what happened on the server side.
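The two thresholds can be sketched as a small decision function. The names `Admission` and `ConnectionLimits` are hypothetical, and how the active counters are maintained is left out.

```csharp
// The three possible outcomes for a new connection.
public enum Admission
{
    Serve,               // under the first limit: process the request normally
    RejectWithResponse,  // over the first limit: answer with an error response
    Drop                 // over both limits: close without any response
}

public sealed class ConnectionLimits
{
    private readonly int _maxServed;
    private readonly int _maxRejected;

    public ConnectionLimits(int maxServed, int maxRejected)
    {
        _maxServed = maxServed;
        _maxRejected = maxRejected;
    }

    // activeServed: connections currently being served.
    // activeRejected: connections currently being answered with a rejection.
    public Admission Decide(int activeServed, int activeRejected)
    {
        if (activeServed < _maxServed) return Admission.Serve;
        if (activeRejected < _maxRejected) return Admission.RejectWithResponse;
        return Admission.Drop;
    }
}
```

In a concurrent server the two counters would need thread-safe updates, for example with `Interlocked.Increment` and `Interlocked.Decrement`.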
When we decide to accept the request, the process
becomes a little more complex.
First, we need to know that the module has a specific attribute called pipeline.
This pipeline is a reference to a function, and after parsing the request
and before sending back the HTTP response, we must pass through the pipeline.
When a connection arrives, we have many actions to perform on the request,
and also on the response, in fact.
For example, we may need to log that a request arrived.
We need to check authorization or authentication if the client asks
for a protected resource.
We need, of course, to route the request to the right
resource or endpoint, and we can do many other things.
Each of these actions on the request or response is handled by what we call
a middleware, and the pipeline is the chain of execution of these middleware. Here is
a visual example of the pipeline and all the middleware that compose it.
Each middleware receives as a parameter what we call an HTTP context.
It contains all the information about the current request
and the response, everything.
When the first middleware finishes what it needs to do with the HTTP context,
it calls the next middleware.
Each middleware is called in sequence.
To configure the pipeline, we can look at the init function of the server,
not of the module.
This is where we create the module.
We will first set up a first pipeline, the web pipeline.
We will create all the necessary middleware and link them together.
We start with the terminal middleware, because we build from the last one.
Then we create the routing middleware, which is connected to the terminal
middleware, and the directory middleware, which is connected to the routing
middleware we just created.
Same thing for the
authentication middleware and the logger middleware, which is the first one.
Then the web module gets a reference only to the first middleware, in
this case the logger middleware.
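The chaining described above, built from the last middleware back to the first, can be sketched with delegates. `HttpContext`, `Middleware`, and `Pipeline` here are simplified stand-ins, not the types from the talk, and each step only records its name instead of doing real work.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Simplified stand-in for the HTTP context passed along the pipeline.
public sealed class HttpContext
{
    public List<string> Trace { get; } = new List<string>();
}

// A middleware takes the context and returns a task.
public delegate Task Middleware(HttpContext context);

public static class Pipeline
{
    // Creates a middleware that does its own work, then calls the next one.
    public static Middleware Step(string name, Middleware next) =>
        async context =>
        {
            context.Trace.Add(name);
            await next(context);
        };

    public static Middleware BuildWebPipeline()
    {
        // Built from the end: each middleware holds a reference to the next.
        Middleware terminal = context =>
        {
            context.Trace.Add("terminal");
            return Task.CompletedTask;
        };
        Middleware routing = Step("routing", terminal);
        Middleware directory = Step("directory", routing);
        Middleware authentication = Step("authentication", directory);
        Middleware logger = Step("logger", authentication);

        // The module only keeps the first middleware of the chain.
        return logger;
    }
}
```

Calling the returned delegate with a context runs logger, authentication, directory, routing, and terminal in that order.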
Let's have a look at how a middleware works.
Here is a simplified version of the router middleware.
We check the HTTP context for the request and which verb is used.
If it's a GET request, we handle it.
Otherwise, we don't manage it in our case, so we just return
an HTTP response with the code 404.
Afterward, we call the next middleware in the pipeline with await next,
passing the context as a parameter.
Since we are in the middleware, let's also take a moment to see how we create an
HTTP response for a resource that exists.
Let's imagine that the request is for the file index.html.
The router middleware will look for this file
in the main directory, so the www directory of your web server.
If it finds it, it will generate an HTTP response with all the necessary metadata.
The metadata is things like the MIME type of the resource you return,
based on the file extension, the content length of the body, and of
course the content of the file in the body, with the HTTP status code 200
because we found the resource. If it doesn't find the file,
we will instead generate a 404 response in place of the
content of the file.
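A sketch of that lookup, assuming a small hard-coded MIME table. `FileResponse` and `StaticFiles` are hypothetical names, and the check that the path stays inside the root directory is an added precaution that the talk does not mention.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Status code, MIME type, and body; the content length is Body.Length.
public sealed record FileResponse(int StatusCode, string ContentType, byte[] Body);

public static class StaticFiles
{
    private static readonly Dictionary<string, string> MimeTypes =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            [".html"] = "text/html",
            [".css"] = "text/css",
            [".js"] = "text/javascript",
            [".png"] = "image/png",
        };

    public static FileResponse Serve(string rootDirectory, string requestPath)
    {
        string root = Path.GetFullPath(rootDirectory);
        string fullPath = Path.GetFullPath(Path.Combine(root, requestPath.TrimStart('/')));

        // Added precaution: never serve anything outside the root directory.
        string rootPrefix = root.TrimEnd(Path.DirectorySeparatorChar) + Path.DirectorySeparatorChar;
        bool insideRoot = fullPath.StartsWith(rootPrefix, StringComparison.Ordinal);

        if (!insideRoot || !File.Exists(fullPath))
        {
            return new FileResponse(404, "text/plain", Encoding.UTF8.GetBytes("Not Found"));
        }

        // The MIME type is derived from the file extension.
        string contentType = MimeTypes.TryGetValue(Path.GetExtension(fullPath), out var known)
            ? known
            : "application/octet-stream";

        return new FileResponse(200, contentType, File.ReadAllBytes(fullPath));
    }
}
```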
However, even if the response is ready now, we will not send it yet.
We simply add it to the HTTP context, because we don't know if other middleware
need to perform some action on this response before it is sent back.
Pipelines and middleware are very common in web services, and this architecture
has many advantages, such as modularity and reusability.
We will explore that a bit later.
It is really easy to extend,
because you can add any middleware to the pipeline at any
place. It's easy to maintain, because each middleware has only one responsibility,
and so it's also easy to test.
It can also help with security, because the request must pass
through all the middleware.
It doesn't have a choice.
In our case, for example, we include an authentication middleware.
It's the second middleware that we call.
That way, we ensure that each request goes through it, and we know whether
the request is authenticated or not.
You can also add rate limiting if you want, with a specific middleware for that.
Again, you can add any security layer that you want.
Okay, let's look at challenge number three:
how to administrate and operate a modern web server.
By administrate and operate, I mean: how do I change the configuration,
restart the service or stop it, or get the
logs of my service, all these actions.
For most web servers, you are required to connect directly
to the server, so to the host,
and to have certain permissions to get this information.
But okay, we are in 2025 now,
and we don't usually manage one web server.
We often manage thousands.
So I want to be able to do everything remotely.
And when I say remotely, I don't mean
connecting remotely to the host; I mean through an HTTP endpoint of my application.
I want to operate my web server through HTTP endpoints.
And that's great, because we're building a web server, so we already
have everything in hand, no?
We just need to add some endpoints
for administration.
We could do that, but another requirement is to have
an administration service that is independent of our main module,
the web module, to have better control over both.
In fact, I want to move from an architecture where I have one web
module with the TCP listener on port 8080 to two modules,
one web module and one admin module, one listening on port 8080
and the other on port 4040.
Again, the ports are just examples.
So each module will listen on its own port and be completely
independent of the other.
That means we can run one without running the other, and it will be okay.
Don't worry, we will reuse what we have already done in the first web module.
We will just need to make a few adjustments.
First, let's talk about the pipeline.
The web pipeline and the admin pipeline will of course be very similar.
But we will replace the routing middleware that is used here
to get resources on the server with the admin middleware, which handles the
operational tasks that we want.
So, staying in the init function of the server,
we will create our new pipeline.
We can reuse most of the middleware we already had, the
terminal middleware, for example,
or even the logger.
But this time we will have our admin middleware.
We start with the admin logger middleware, and this admin logger
middleware is of course connected to the auth middleware, which didn't change.
The auth middleware is connected to the directory middleware.
And the directory middleware is not connected to the router
middleware this time, but to the admin middleware, and then to the terminal middleware.
So we have a connection manager, that is, a module:
the admin connection manager this time, instead of web.
We now have two modules, two connection managers, with two different pipelines.
Let's take a look at the admin middleware.
It is very similar to the router middleware,
maybe a little longer, because this time we will not look for files
or resources; we will look for endpoints.
We have the main endpoints, like status, reload, restart, and logs.
When we hit one of these endpoints, we execute the corresponding
function of our web server
and return the result of the function in the HTTP response.
For the status, for example, we have the status variable in our server, so
we get this value, put it in the body of an HTTP response, and return it.
Same thing for the reload.
We execute the reload function, which reloads the configuration file,
and return that the configuration is reloaded when it's done.
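The endpoint-to-function mapping can be sketched as a lookup table. `AdminRouter` and its methods are hypothetical names, and the endpoint paths in the usage note are only examples.

```csharp
using System;
using System.Collections.Generic;

public sealed class AdminRouter
{
    // Maps an endpoint path to the server function that produces its result.
    private readonly Dictionary<string, Func<string>> _endpoints =
        new Dictionary<string, Func<string>>(StringComparer.OrdinalIgnoreCase);

    public void Map(string path, Func<string> action) => _endpoints[path] = action;

    // Returns the status code and body to place in the HTTP response.
    public (int StatusCode, string Body) Handle(string path)
    {
        if (_endpoints.TryGetValue(path, out var action))
        {
            return (200, action());
        }
        return (404, "Unknown admin endpoint");
    }
}
```

For example, `router.Map("/admin/status", () => "running")` would make a request to that path return the status value in the body.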
Since authentication is crucial for the admin section, even though it
was implemented before, let's have a look now at how we handle this part.
Here we've implemented the simplest authentication method,
because, remember, it's always good to start simple
before going into complex areas.
So we use basic authentication.
It's not the most secure option, but it serves our purpose right now:
it protects the endpoint with authentication.
That is what we asked for; we didn't ask for the most secure method.
But of course, in production, you should not use that.
To make this work, we take the HTTP request.
In this HTTP request, if the client is
sending credentials, we will have the Authorization header.
And from this header, we can get the username and the password.
We can then check whether they match the username and password
we have in an access file
on the server.
If yes, the user is authenticated.
If not, they are not.
We then just add to the HTTP context that the user
was authenticated correctly, at least for the directory requested.
That way, the next middleware has this information
and doesn't need to check it itself.
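Extracting the username and password from a basic authentication header could look like this sketch. `BasicAuth.TryParse` is a hypothetical helper, and the comparison against the access file is left out.

```csharp
using System;
using System.Text;

public static class BasicAuth
{
    // Parses a header value of the form "Basic <base64 of username:password>".
    public static bool TryParse(string headerValue, out string username, out string password)
    {
        username = "";
        password = "";

        const string prefix = "Basic ";
        if (headerValue == null ||
            !headerValue.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
        {
            return false;
        }

        try
        {
            byte[] raw = Convert.FromBase64String(headerValue.Substring(prefix.Length).Trim());
            string decoded = Encoding.UTF8.GetString(raw);

            // The username ends at the first colon; the password may contain colons.
            int colon = decoded.IndexOf(':');
            if (colon < 0) return false;

            username = decoded.Substring(0, colon);
            password = decoded.Substring(colon + 1);
            return true;
        }
        catch (FormatException)
        {
            // The value was not valid Base64.
            return false;
        }
    }
}
```

Because Base64 is only an encoding, the credentials travel in readable form unless the connection itself is encrypted.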
Administration endpoints
introduce a sub-problem.
How do we provide enough endpoints to ensure that the administrator can
perform all necessary tasks without directly accessing the server?
Sure, we have some endpoints for logs, for stopping and restarting the server.
But what if the admin wants to, I don't know, check some permissions or handle
another custom scenario, like we do with Bash, for example? We cannot
anticipate every possible scenario.
So we need to find a way to offer flexibility to the administrator,
but still through HTTP endpoints.
That's why we have another endpoint called script.
This endpoint can do several things.
It can list all the scripts present on the machine:
if you just hit the script endpoint, this is what it will do.
There is a specific folder on the server, and all the scripts that are
in this folder will be displayed.
And if you use this endpoint with a query parameter,
we will execute the script based on its name, which is the
value passed in the query parameter.
The administrator can now upload any script they want and later
execute it via an HTTP request, with of course the result of the script
returned as an HTTP response.
Since scripts depend on the platform, we naturally need a different
implementation for Linux and for Windows.
But that is the same issue I mentioned at the beginning:
it's just abstraction
and different implementations depending on the platform.
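One way to hide the platform difference is to build the process start information in a single place. This is a sketch under assumptions: the talk does not say which interpreters are used, so Bash on Linux and PowerShell on Windows are illustrative choices, and `ScriptRunner` is a hypothetical name.

```csharp
using System.Diagnostics;
using System.Runtime.InteropServices;

public static class ScriptRunner
{
    // Builds the command used to run a script on the given platform.
    public static ProcessStartInfo CreateStartInfo(string scriptPath, bool isWindows)
    {
        var info = new ProcessStartInfo
        {
            // Assumed interpreters: PowerShell on Windows, Bash elsewhere.
            FileName = isWindows ? "powershell.exe" : "/bin/bash",
            RedirectStandardOutput = true, // capture output for the HTTP response
            UseShellExecute = false,
        };

        if (isWindows)
        {
            info.ArgumentList.Add("-File");
        }
        info.ArgumentList.Add(scriptPath);
        return info;
    }

    // Convenience overload that detects the current platform.
    public static ProcessStartInfo CreateStartInfo(string scriptPath) =>
        CreateStartInfo(scriptPath, RuntimeInformation.IsOSPlatform(OSPlatform.Windows));
}
```

The result can be passed to `Process.Start`, and the standard output read back into the response body. A real endpoint would also need to validate the script name taken from the query parameter before building the path.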
Okay, now that we have solved the issue of the admin endpoints,
let's talk about challenge number four:
logging, and effective logging.
Logging is crucial for administrators and SREs,
and good logging is even more important.
Of course, our application needs to handle it very well.
Since we have different, independent modules,
the logging system must also reflect that.
For me, a good log should include at least the following information:
the date, the module, the thread ID, the level, and of course the error message.
To fully understand the issue we have with logging, we need to keep in
mind that the two modules, web and admin, are two instances of the same class.
These instances can call the same function, the logger middleware, and
the logger middleware, when it logs its information, must be able to
distinguish which module is calling it.
Since both modules are essentially the same, we need a way to identify
which module is interacting with the logger at any given time.
This means that the logging system should be capable of tracking
the context a call comes from,
whether it's the web module or the admin module, to ensure
that the correct information is logged.
To solve this, we must first understand the concept of execution context.
Each part of the code runs in its own context.
When the server runs, we are in the main function, in the main
execution context of the program.
When a new task is created, it forms a child execution context.
And something important to know
is that when we run the server,
with the RunAsync function, not of the module but of the server itself,
the server starts each module in a new task.
When we enter a child execution context, it inherits from the parent.
In most cases, we don't care about this, but
it's actually the basis of one of the solutions we can implement for our modules.
We can create a static class named ExecutionContext, which contains a
static variable of type AsyncLocal.
In our case, it's an AsyncLocal holding a logger, but it could hold any type.
This variable has a getter and a setter, allowing us to
modify the value and to read it.
What is great about this approach is that, because the class is static,
it is accessible everywhere.
But, importantly, the value it holds is unique to each execution
context, because of the AsyncLocal.
Let's take a look at how it works in practice in the module.
Let's come back to the start method
of one of the modules.
Whether it's the web or the admin module, it's the same start
method that is called, so it doesn't matter.
At this point, we create a new logger, which receives the module
name in its constructor, and we save it
in the ExecutionContext.Logger
variable.
From here, this logger value is specific to the execution
context in which it was set.
All child execution contexts, such as tasks or asynchronous operations within
that context, inherit this logger value.
This ensures that each execution context has its own logger instance
and can log with the appropriate module-specific information, without
interfering with other contexts.
What that means is that, whether we are in our web module or our admin module,
the logger used is always tied to the specific execution context.
This happens because of what we set up with the AsyncLocal variable
in the ExecutionContext class.
We ensure that each execution context has its own logger.
So when you log a message using ExecutionContext.Logger,
it will always use the logger that was defined for the current module's
execution context, whether you are in the web module or the admin module.
This is true no matter where you are in the code, as long as you are
within the context of that module.
Even if you call a function or use a class
that is defined outside of the module, it will inherit the logger from
the current execution context.
This way, your logs stay consistent and specific to the module from
which they are generated, which makes it easier to trace or to debug.
So when you call any kind of middleware, like the logger middleware,
and you call ExecutionContext.Logger.LogInfo,
we know which logger it is, because we inherit it from the
execution context of the module.
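The AsyncLocal approach can be sketched as follows. The static class is named `ServerExecutionContext` here only to avoid confusion with .NET's own `System.Threading.ExecutionContext`. `ModuleLogger` and `ModuleDemo` are also hypothetical names, and the log format is an assumption based on the fields listed earlier.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class ModuleLogger
{
    public string ModuleName { get; }

    public ModuleLogger(string moduleName) => ModuleName = moduleName;

    // Date, module, thread ID, level, and message.
    public string Format(string level, string message) =>
        $"{DateTime.UtcNow:O} [{ModuleName}] [thread {Environment.CurrentManagedThreadId}] {level}: {message}";
}

public static class ServerExecutionContext
{
    // Static, so reachable everywhere; AsyncLocal, so the value is per execution context.
    private static readonly AsyncLocal<ModuleLogger> Current = new AsyncLocal<ModuleLogger>();

    public static ModuleLogger Logger
    {
        get => Current.Value;
        set => Current.Value = value;
    }
}

public static class ModuleDemo
{
    // Each module starts in its own task and sets its own logger there.
    public static Task<string> StartAsync(string moduleName) => Task.Run(async () =>
    {
        ServerExecutionContext.Logger = new ModuleLogger(moduleName);
        await Task.Delay(10); // asynchronous work inside the module keeps the value
        return SharedLoggerMiddleware("request received");
    });

    // Shared code that does not know which module is calling it.
    private static string SharedLoggerMiddleware(string message) =>
        ServerExecutionContext.Logger.Format("INFO", message);
}
```

Starting `ModuleDemo.StartAsync("web")` and `ModuleDemo.StartAsync("admin")` side by side produces one line tagged with each module name, while the caller's own context never sees either logger.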
There are, of course, a lot of other challenges that I didn't mention here.
You can take this purely as an introduction to the kind of problems you will
need to solve when you start to build your own services. After that, depending on what
interests you the most, you could investigate performance topics, like
caching or multiplexed requests,
or maybe security features like
HTTPS, because for the moment everything was in HTTP.
That's why basic authentication was really not secure in our case.
Also maybe other authentication methods.
Or, if you are more interested in it, you could think about
the monitoring of your application:
how you can generate metrics,
what new endpoints you can provide to expose these metrics, et cetera.
There is one bonus challenge
to finish this presentation, and I want to mention it: it is what
name you will give to your web server.
Of course, if you create a web server, you need to give it a name, and that is
the most interesting part, I think.
It's a really interesting exercise, because it unlocks your creativity,
and I would be really pleased to hear the ideas you have in mind.
If you were to create your own web server, what name would you give it?
As for mine,
because I built it taking inspiration from the best-known one,
which is Apache, and because I really like pandas,
I decided to name it PandApache.
So if you want to explore the code of a web server in C#, you can easily
find it on GitHub under that name.
Thank you for your attention and your time.
I hope I sparked your interest in this topic, and enjoy the rest of the conference.
Thank you.