Conf42 Golang 2023 - Online

Art of building secure and scalable Webhooks

Abstract

Webhooks help improve an application's communication with third-party services, but they are among the most challenging services to build. Using Go, one can create lightweight, secure, and scalable webhooks, but how? In this talk, I discuss how to build secure and scalable webhooks with Go.

Summary

  • Marvin Collins: The term webhook was coined by Jeff Lindsay back in 2007. We use webhooks to eliminate the polling process. Using webhooks helps conserve resources for the client application. The talk covers security concerns, security approaches, and webhook scalability.
  • Making webhooks secure is different from normal web application security. A webhook is a URL that is accessible on the Internet. Without verification of the source, an attacker can fake a request and send it to that URL. At what point do we start securing our webhook?
  • The next webhook security method is the verification token. This simply means there's a secret token that is shared between the client and the provider. Hash-based message authentication code (HMAC) is one of the most popular methods. Another method is whitelisting IPs on both servers, for the client or the provider.
  • The best approach is to use a hash-based message authentication code with a timestamp. The data being shared should be minimal. Webhooks do not require continuous polling for data. But at some point we need to make our business more scalable.
  • The number one rule of thumb when dealing with webhook security is to authenticate. Number two is to provide less data and encrypt all data if necessary. Perform logging and tracing for webhooks. Provide documentation to help developers implement your webhooks.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello everyone, welcome to my session about webhook security and scalability. My name is Marvin Collins, and in this talk I'm going to talk about webhooks. I will start with an introduction and explain what webhooks are. Then we will move on to the use cases, the security concerns, the security approaches, and webhook scalability. Finally, I will share some thoughts from my own experience of dealing with webhooks.
From personal experience, I've done a lot of research while developing webhook applications, and I'm sure you've had experience dealing with webhooks too, whether it's integration, implementation, or just adding them to a third-party system. I will assume that you understand the concept of webhooks, and I will briefly explain it for those who don't.
The term webhook was coined by Jeff Lindsay back in 2007. A webhook is just a URL, a kind of reverse API, that is created by the application developer (referred to as the client) to receive information from the API provider (often referred to as the server) without polling the server. We'll discuss what polling is in a moment. Webhooks are basically just another way for applications to communicate, the same as a REST API, if you understand what a REST API is. They use the JSON format, and the request is made through an HTTP POST, which is an HTTP method, the same as with a REST API.
For example, when a PR is opened on a GitHub repo, you often receive a notification if you're the project lead who needs to review it, or you can listen for the webhook. This is a good example of a webhook, because just imagine going to a GitHub page and refreshing it over and over. That is what is called polling. Polling is the process where you repeatedly send requests to the API, which is the server, to check for new or updated data.
This is done at regular intervals from the client application to the server to make sure that the client application stays in sync with the server. But one thing you need to know is that polling is resource-intensive and inefficient. As you can see on the right side of the screen, that is the polling process, where a client sends a request to the server at intervals. The story is different with webhooks. Webhooks are a kind of "don't call me, I will call you when I have information," and you can see the difference on the screen. On the left side we have webhooks, and on the right side we have polling. A webhook sends only one request to the application, but on the polling side you can see multiple requests happening, and some are failing.
But when are we supposed to use webhooks? Very simple. Number one, we use webhooks to eliminate the polling process, which I just showed you a few seconds ago. When using webhooks, you can help conserve resources for the client application. With webhooks, there's no constant polling of the server. The data is transferred based on events, and it's very simple and risk-free, with only critical or necessary information. Unlike a traditional API, where relying on polling for data requires users to constantly check whether the data is there without any trigger event, webhooks allow an application to transmit data based on events, or when the data is available, and they send it immediately.
Another use case of webhooks is automated data transfer on events. I've mentioned events in my previous explanation, but this means webhooks send data automatically: as soon as there's an event on a resource in the server, the data is sent in real time. This makes it easy to automate data transfer based on events. Then we have integration. As previously mentioned, we've built a lot of systems, and these systems need to support each other, communicate, and share data.
Webhooks allow us to implement that with ease. A client application can rely on information from other systems to create triggers and actions within the application, and webhooks help us create those triggers and actions.
The second part of this discussion is the security concerns. By default, webhooks do not come with a security implementation, and this is a big challenge. The webhook communication mechanism does not have any native way to identify the source and the destination. This means that a webhook producer has no way whatsoever to verify that it is sending its webhook data to the right destination, and the webhook consumer, which is the client, cannot verify that it is receiving a webhook from the expected source. As I mentioned earlier, this vulnerability can allow anyone to act as a webhook producer, which is the server, or as a consumer, which is the receiver. Someone posing as the producer can send any kind of compromising data to the receiving application, the client application. We need to make sure that these systems are very secure, and that is where we explore the security concerns and come up with approaches to secure our webhooks.
So I will recap the security concerns and explain them. Number one, the webhook communication mechanism lacks a native way to identify the source and the destination of a webhook. That is the major red flag when dealing with webhooks. If your webhook cannot identify the producer and the consumer, that is a security concern you need to look at. Number two, this means that our webhook producer cannot verify that it is sending a webhook to the correct destination, and the webhook consumer cannot verify that it's receiving its webhook from the expected source.
This vulnerability, as I mentioned, can allow anyone to act as a webhook producer or receiver and potentially send a malicious webhook to a webhook consumer, thereby compromising the receiving application. This is where you get hacked, based on the data that your application is receiving. So how can we make sure that our webhooks are secure? Let's discuss that in this section.
Before we dive deep into webhook security: making webhooks secure is different from normal web application security. This is because a webhook is a URL that is accessible on the Internet. It's publicly available, as compared to API endpoints or URLs, which are mostly secured, though some are public. Therefore, whenever there's a request to the webhook URL, it's very important to ensure that the request truly comes from the expected source, as we discussed earlier. Without such verification, an attacker can fake a request and send it to that URL.
But at what point do we start securing our webhook? Some of it is done at setup, when you're setting up a webhook, and the rest is mostly done at runtime. The first one we're going to look at is one-time verification, which is mostly done at setup. This is where the provider gives the client a token, a one-time verification. Just to let you know, this one-time verification can be revoked. The provider gives the client instructions on the best way to manage this token. The token acts like a secret key, but once issued it's not managed by the provider, and the provider cannot tell how the client is managing the token. So on every webhook request, the provider sends the security token that it issued to the client. Once a verification token is set and registered, the client validates it. It's the job of the client to verify the token on every request.
If it matches, the request is accepted. Otherwise, the client should ignore and deny that request. The disadvantage is that the security is very limited because, as I mentioned, you don't know how well the client has implemented this on their end. This is a widely used method of webhook security. Still, it leaves a webhook URL exposed to a lot of security issues, because it can be attacked by DDoS, by server-side request forgery, and by other attacks. So it is not the most recommended, but it's being used by companies like Zoom to manage their webhooks.
The next webhook security method is the verification token. This simply means there's a secret token that is shared between the client and the provider. This secret, or verification token, is sent on every request. It's very simple. On each request, the provider sends a webhook request containing the secret, which is shared between the client and the server, in the request headers. This secret can be something like a Base64-encoded username and password, or just a normal security key. The client then validates the value on each request, checking whether the value sent is the same as the value it has. It's another commonly used webhook validation and authentication process, but it's not that effective. This method leaves many things unaddressed, and it does not secure your webhook application as well as the most preferred way.
The next one is HMAC, which simply means hash-based message authentication code. HMAC is one of the most popular, actually it's the most popular, webhook security method used during requests. It simply puts a hash signature in the headers, with a timestamp enabled for validation.
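As an illustration of the shared-token check described above (this is not code from the talk), here is a minimal sketch of a receiver that compares a token sent in a request header against the secret it holds. The header name `X-Webhook-Token`, the helper names, and the secret value are assumptions made for the example; real providers document their own header names.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// tokenHeader is a hypothetical header name chosen for this sketch.
const tokenHeader = "X-Webhook-Token"

// tokenMatches reports whether the received token equals the shared secret.
// ConstantTimeCompare avoids leaking, through response timing, how much of
// the token was correct.
func tokenMatches(got, want string) bool {
	return subtle.ConstantTimeCompare([]byte(got), []byte(want)) == 1
}

// requireToken wraps a handler and rejects any request whose token header
// does not match the shared secret.
func requireToken(secret string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !tokenMatches(r.Header.Get(tokenHeader), secret) {
			http.Error(w, "invalid token", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	handler := requireToken("s3cret", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusNoContent) // event accepted
	}))

	// Simulate a provider sending a webhook with the correct token.
	req := httptest.NewRequest(http.MethodPost, "/webhook", nil)
	req.Header.Set(tokenHeader, "s3cret")
	rec := httptest.NewRecorder()
	handler.ServeHTTP(rec, req)
	fmt.Println(rec.Code) // prints 204
}
```

Note that a static token proves only that the sender knows the secret. It says nothing about whether the body was altered or replayed, which is the gap the HMAC method addresses.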
Examples of companies that leverage this webhook security method are GitHub, Shopify, Slack, you name them. Basically, the server, or the provider, computes a signature (I'm going to display this in plain text) and then sends it to the client. Since the client has the secret, it also computes a signature and compares the two. If they're the same, the request is accepted as valid. After doing that computation and accepting the authenticity of the message, the client application is allowed to consume it.
But how do we use the timestamp? There's a time window within which the message is allowed to be received and consumed, and if that window has elapsed, the message is considered irrelevant, so it's not consumed by the client. That's where the timestamp becomes valuable in this method. Now, if you compare HMAC with a shared secret or verification token, they're more or less the same, but there's more integrity when using HMAC compared to a shared secret. HMAC also gives you leeway to reject the message if it arrives after a certain amount of time.
Another security method is simply whitelisting IPs on both servers, for the client or the provider. It's not usually that effective, because there's IP spoofing. IP spoofing is a process where attackers impersonate a host by making their IP look the same as the IP you expect. So it's not one of the best, and it's not recommended. The implementation is also a bit of a hassle, because when the IP changes, you have to do the setup again.
We have mutual TLS, which is one of the best webhook security methods.
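To make the timestamp window described above concrete, here is a hedged sketch of receiver-side verification. It assumes the provider signs the string `<unix timestamp>.<body>` with HMAC-SHA256 and sends the timestamp and hex signature alongside the body. That signing format, the function names, and the 300-second tolerance are assumptions for illustration, not something the talk specifies.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strconv"
	"time"
)

// computeMAC returns the HMAC-SHA256 of "<ts>.<body>" under the shared secret.
func computeMAC(secret []byte, ts int64, body []byte) []byte {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(strconv.FormatInt(ts, 10)))
	mac.Write([]byte("."))
	mac.Write(body)
	return mac.Sum(nil)
}

// sign is what the provider would do before sending the webhook.
func sign(secret []byte, ts int64, body []byte) string {
	return hex.EncodeToString(computeMAC(secret, ts, body))
}

// verify is what the client does on receipt. It rejects messages whose
// timestamp is outside the tolerance window (a possible replay) and messages
// whose signature does not match the recomputed one.
func verify(secret []byte, ts int64, body []byte, sigHex string, now, toleranceSec int64) bool {
	age := now - ts
	if age < -toleranceSec || age > toleranceSec {
		return false
	}
	got, err := hex.DecodeString(sigHex)
	if err != nil {
		return false
	}
	// hmac.Equal compares in constant time.
	return hmac.Equal(got, computeMAC(secret, ts, body))
}

func main() {
	secret := []byte("shared-secret")
	body := []byte(`{"event":"invoice.paid","id":"evt_1"}`)
	ts := time.Now().Unix()
	sig := sign(secret, ts, body)

	fmt.Println(verify(secret, ts, body, sig, time.Now().Unix(), 300))      // fresh message
	fmt.Println(verify(secret, ts-3600, body, sig, time.Now().Unix(), 300)) // hour-old timestamp
}
```

Because the timestamp is part of the signed string, an attacker cannot simply attach a fresh timestamp to an old captured message: the signature would no longer match.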
Whenever you are sending a request from one service to another, there's what is called the Transport Layer Security handshake protocol. The server sends a certificate to the client, and the client verifies that the certificate is coming from the server it is communicating with. With mutual TLS, not only does the client verify the server, but the server also verifies the authenticity of the client, so they both verify each other. This method is very secure and is used by big companies like DocuSign, but most of the time it's very difficult to maintain. One of the biggest challenges is that the certificate can expire, be changed, or be revoked, and that means you have to set it all up again. That is the downside, and it's usually not the best way to manage a very high-demand webhook service.
So from all the examples that we've looked through, starting with the one-time verification process, the verification token or shared secret, hash-based message authentication, IP whitelisting, and mutual TLS, what is the best approach to implement webhook security? It's very debatable, and hear me out. The reason is that the security of your webhook depends on the data that you're supposed to share with the client, or the data you're supposed to receive as a client. If it is just average data that does not expose a lot, then the best way is to use HMAC, that is, hash-based message authentication code, with a timestamp. Also, the data that is being shared should be data-less: it should contain nothing meaningful on its own, only the minimum.
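For the mutual TLS method described above, the Go standard library exposes the "both sides verify each other" behaviour through `tls.Config`. The sketch below is an illustration only: it assumes the webhook receiver runs the HTTPS server and that the sender's certificate is issued by a CA the receiver trusts. The function name and the empty certificate pool are placeholders.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
)

// mutualTLSConfig returns a server-side TLS configuration that refuses any
// connection unless the client presents a certificate signed by one of the
// CAs in clientCAs. The server's own certificate would be added separately.
func mutualTLSConfig(clientCAs *x509.CertPool) *tls.Config {
	return &tls.Config{
		ClientAuth: tls.RequireAndVerifyClientCert,
		ClientCAs:  clientCAs,
		MinVersion: tls.VersionTLS12,
	}
}

func main() {
	// Placeholder pool: in practice it would be filled with the sender's CA
	// certificate, for example via pool.AppendCertsFromPEM.
	pool := x509.NewCertPool()
	cfg := mutualTLSConfig(pool)
	fmt.Println(cfg.ClientAuth == tls.RequireAndVerifyClientCert)
}
```

The maintenance cost mentioned in the talk shows up here as well: whenever the sender's certificate or CA changes, the pool passed to this function has to be updated and redeployed.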
This means that whenever the client receives that notification, they can do a single poll, or retrieve the resource they need through the API. Our webhook just notifies: it acts as an event to the client and creates an action, and that action is what you use to get the complete resource from the server. That is the best approach, I think.
And just to show you this in Golang code, let's open VS Code real quick. I have this ready here. You can see we get a signature here, so let's look at this function. This function just takes the data, the plain text that we are supposed to send to the client, and a secret that is shared between the server and the client. We generate a new HMAC hash using the given hash type and key. The hash type here, the computing algorithm, is SHA-1, and the key is the secret. Then we just write the data and return it in the encoding format, which is hex EncodeToString. So this is the signature that we return, and that signature is going to be included in the request headers, as you can see. When it reaches the client side, the client uses the key that we shared with them, and they take the request body, which is just the data, this JSON body you can see here. They use the secret key and the body to create a signature and match it against the one received. If the signatures are the same, they confirm that the data is valid.
Now, winding up this part: the best approach is to have hash-based authentication with a timestamp, the one I showed you, with minimal data, and when it reaches the client side, they make a request to the server based on the action provided in the webhook event.
So, webhook scalability. Let's look at that real quick.
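The code shown on screen during the walkthrough is not part of this transcript, so the following is a reconstruction sketch of the provider side as described: take the data and the shared secret, create an HMAC, write the data, hex-encode the result, and place it in a request header. SHA-256 is used here instead of the hash named in the talk, and the header name, URL, function names, and payload are all assumptions for illustration.

```go
package main

import (
	"bytes"
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"net/http"
)

// signatureHeader is a hypothetical header name chosen for this sketch.
const signatureHeader = "X-Webhook-Signature"

// getSignature returns the hex-encoded HMAC-SHA256 of data under secret.
func getSignature(data, secret []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(data)
	return hex.EncodeToString(mac.Sum(nil))
}

// newSignedRequest builds the outgoing webhook POST with the signature of
// the exact body bytes attached as a header.
func newSignedRequest(url string, body, secret []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set(signatureHeader, getSignature(body, secret))
	return req, nil
}

func main() {
	// A thin payload: only the event type and an ID the client can use to
	// fetch the full resource through the API.
	body := []byte(`{"event":"order.created","id":"evt_123"}`)
	req, err := newSignedRequest("https://client.example/webhook", body, []byte("shared-secret"))
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Header.Get(signatureHeader))
}
```

The client repeats `getSignature` over the raw request body with its copy of the secret and compares the result with the header value, ideally using `hmac.Equal` on the decoded bytes rather than a plain string comparison.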
We've talked about all this implementation and setup, but now our application is serving a lot of users and we need to scale. That means our webhooks should be able to handle large volumes of data with ease, making them highly scalable and available for transfers between our application and other applications. Webhooks do not require continuous polling for data. They are much more efficient and resource-friendly, and they send data in real time. But at some point we need to make our business more effective, more scalable based on demand. Webhooks, in conjunction with other solutions like your infrastructure, can be made very scalable. So let's look at some of the ways we can make our webhooks scalable.
Number one, optimize your webhook payload. This is very simple. It just means ensuring that your webhook payload, the request data that is being sent, is minimal and contains only the necessary data. This reduces the amount of data that you are sending.
Number two, implement load balancing. Using a load balancer, we can distribute the traffic and the workload across multiple servers or services. This prevents any single service from being overwhelmed by a large volume of requests, so it is very important.
Use a message broker. Don't just send the content directly to your webhook service. Examples of message brokers are NATS, which we've implemented, and others such as RabbitMQ and Kafka. Using a message broker to handle requests and distribute processing helps reduce latency and improves scalability by a huge percentage, because the data is sent to the message broker and the webhook service just picks it up from the broker and distributes it.
Implement caching. This is a very good one.
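The message broker point above can be sketched in-process before reaching for real infrastructure. In the example below a buffered Go channel stands in for the broker (NATS, RabbitMQ, or Kafka in the talk) and a small worker pool plays the webhook delivery service. The `event` type, the `dispatch` function, and the no-op `deliver` callback are hypothetical names for illustration, not a broker client API.

```go
package main

import (
	"fmt"
	"sync"
)

// event is a minimal webhook event: an ID plus a thin JSON payload.
type event struct {
	ID      string
	Payload string
}

// dispatch starts the given number of workers, each pulling events from the
// queue and handing them to deliver. It returns the number of successful
// deliveries once the queue is closed and drained.
func dispatch(queue <-chan event, workers int, deliver func(event) error) int {
	var (
		wg        sync.WaitGroup
		mu        sync.Mutex
		delivered int
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for ev := range queue {
				if err := deliver(ev); err != nil {
					continue // a real system would retry or dead-letter here
				}
				mu.Lock()
				delivered++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return delivered
}

func main() {
	// The producer side only enqueues, so it never waits on slow receivers.
	queue := make(chan event, 100)
	for i := 0; i < 10; i++ {
		queue <- event{ID: fmt.Sprintf("evt_%d", i), Payload: "{}"}
	}
	close(queue)

	n := dispatch(queue, 4, func(ev event) error {
		return nil // stand-in for the HTTP POST to the client's webhook URL
	})
	fmt.Println(n) // prints 10
}
```

A real broker adds what the channel cannot: persistence across restarts, delivery across multiple machines, and redelivery when a worker crashes mid-request.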
When you implement caching, it reduces how often you need to fetch data. Common data that doesn't change often can be put in the cache, and if it needs to be included in our webhook request, that reduces the number of requests or transactions needed, say, to get this data from the database or other services. When you put this common data in a cache, it reduces the frequency of accessing that data by a significant percentage. I know this is not common when dealing with webhooks, but I recommend it, and it will improve your scalability.
And finally, monitoring. If you don't monitor your webhooks, you will never know the bottlenecks of your webhook infrastructure and setup. Monitoring identifies webhook scalability issues, and you can use those metrics and logs to trace the response times, the errors, and the key performance metrics within your application. Those are the key things that you need to do to scale your application.
As I wind up this talk, I have some items that I want to reiterate. The number one rule of thumb when dealing with webhook security is: authenticate. That is, verify the source and verify the consumer, using the authentication methods we've mentioned during this talk. Number two: provide less data, and encrypt all data if necessary. That makes it much easier to secure the data passing from the server side to the client. And I will repeat this: use timestamps to prevent replay attacks, where an attacker replays the message many times. Provide SDKs for users so that they know how to implement the webhook. Also provide documentation, very good documentation, listing the best way to implement webhook security.
That will help a lot when developers are trying to implement your webhooks. Perform logging. Webhooks are part of event-driven architecture. With events, you should be able to trace a user through the system, from account creation onward. The same thing should happen with webhooks. You should perform logging and tracing for webhooks, and that will give you a clear picture of them. And finally, please provide webhook event IDs so that you can trace a webhook to a specific point in time and to its origin. Those were my parting shots, and I want to thank you. Asante sana. My name is Marvin Collins. My Twitter handle is at Marvin Collins.

Marvin Collins Hosea

Founder @ AppsLab KE
