Conf42 JavaScript 2022 - Online

Implementing a performant URL parser from scratch


Abstract

Performance plays an important role in engineering. After looking into contributing to Node.js, implementing a very fast, specification-compliant URL state machine and parser for Node.js became the project that pushes the boundaries of both JavaScript user-land and Node.js.

I’ve been working on implementing the URL parser, currently written in C++, in Rust and WebAssembly. I want to share my experience working on it and benchmarking it against the native implementation. The first part of the talk covers implementing it in Rust and WebAssembly, which I wrote about at https://www.yagiz.co/implementing-node-js-url-parser-in-webassembly-with-rust/, and then I’ll focus on implementing it in JavaScript and releasing it under github.com/anonrig/url-js.

Summary

  • Today I'm going to be talking about implementing a performant URL parser from scratch. Yagiz Nizipli is the creator of url-state-machine and fast-querystring.
  • Every single network request uses the URL parser, which basically makes it the baseline of benchmarks for any network request. URL parsers are a very significant component, but they are complex and lead to bugs.
  • Can we write a fast URL parser using JavaScript? Today I'm going to talk about the road to a fast URL parser for Node.js. The url-state-machine npm package is available through GitHub.
  • The JavaScript-based parser was 566.66% faster compared to the other JavaScript implementation. If you want to make things faster, you need to find shortcuts. The current implementation is 99.9% spec compliant.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Real-time feedback into the behavior of your distributed systems, and observing changes, exceptions, and errors in real time, allows you to not only experiment with confidence, but respond instantly to get things working again.

Today I'm going to be talking about implementing a performant URL parser from scratch. Before we start, I want to give a little bit of information about me. I'm Yagiz Nizipli. I'm a software engineer and a master's student at Fordham University. I'm a Node.js core collaborator and performance team member. I've been mainly focusing on text encoders and decoders and on performance in Node.js. I'm the creator of url-state-machine and fast-querystring.

Let's give a little bit of background before we get into what URL parsers are. A brief introduction to JavaScript runtimes: Node.js is a runtime on top of the V8 engine, the JavaScript engine behind Chromium. Just like Node.js, Deno also uses V8, and Bun, a relatively new runtime, uses JavaScriptCore, the engine behind WebKit. What basically happens in JavaScript runtimes is that whenever you call a function, the JavaScript user-land underneath runs C++ code using a bridge through Node.js. The value that you retrieve from the C++ side is deserialized on the JavaScript side and then presented to the user, which is the developer. So a computationally heavy function can live in C++. This means that you don't need to keep all of your code in JavaScript; you can write it in C++ if you want to produce performant code.

So let's start with URL parsers. Why do URL parsers matter? URL parsers are the most important functionality for a web server. Every single network request uses the URL parser, and it basically creates the baseline of benchmarks for any network request. Here is an example diagram of a request coming to a server. On the left side there's a person, let's say Alice or Bob. Bob submits credentials through a user agent, which sends them to the client. The client sends a request, submitting the credentials, client ID, and redirect URL to the authorization server; the authorization server returns an authorization code, the user agent redirects with the authorization code, the client requests a token, and so on and so forth. Basically, every arrow from left to right includes a URL, and for every left-to-right request the server needs to parse and understand what that URL is and what its path is. That's where URL parsers come in.

So the URL parser is a very significant component, but it's too complex and leads to bugs, and that's why nobody focuses, and nobody wants to focus, on it. Here is one of the examples from while I was working on this project: I thought I had found a bug, because it seemed that a URL like 256.256.256.256 was invalid. One of the libraries, whatwg-url, reported it as invalid, yet it actually worked in Chrome and Safari, but not in Node.js. This meant that the existing URL parser implementation was somehow wrong or not up to date. After this feedback, I opened an issue, "URL parser does not invalidate IPv4 addresses with more than four parts". Another issue, "URL is not compliant with the specification", was also created by me, and they were all fixed later.

But let's talk about the goal of a URL parser. What is the goal? If you look at this diagram, you will see, highlighted in orange, https://user:pass@example.com:8080. https is the protocol. Username and password are optional values marked by the @ character, and then the hostname is always there. The port, if it's a special port like 80 or 443, is not required, because URL parsers understand that from the scheme. And then we might need a pathname, a search, and a hash.
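As a quick illustration of that anatomy, here is how the WHATWG URL API, the same API these parsers implement, exposes each component; the example URL is mine, not from the talk:

    const url = new URL('https://user:pass@example.com:8080/path?q=term#section');

    console.log(url.protocol); // 'https:'
    console.log(url.username); // 'user'
    console.log(url.password); // 'pass'
    console.log(url.hostname); // 'example.com'
    console.log(url.port);     // '8080'
    console.log(url.pathname); // '/path'
    console.log(url.search);   // '?q=term'
    console.log(url.hash);     // '#section'

    // Default ports of special schemes are omitted by the parser:
    console.log(new URL('https://example.com:443/').port); // '' (443 is implied)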
So if you look at this diagram, you could easily say, oh, okay, it's really easy to understand and implement a URL parser, right? You just need to check whether it starts with https, whether there's a user and a password, and then what about hostnames? What about ports? What about the pathname? We want to make sure that the search is a valid one, and so on and so forth. But it's not as easy as writing a regex; it's much more complicated.

So what is the state machine behind it? There is actually a specification for writing URL parsers, and this is an example of it. This diagram represents all of the states that need to be traversed in order to complete the URL parsing. If you look at it from a closer perspective, there is a scheme start state. The scheme start state can go to the no scheme state; let's say the input doesn't start with http or https. If it does start with a scheme, then we go to the scheme state, and we need to parse that scheme and determine whether it's a valid one, and which of the valid ones it is. If it's https, then we might omit the port part, because https is always 443; if it's http, it's 80; and similarly for ssh. So there are some special cases, and depending on the scheme it can go to the special authority state, the special relative authority state, or the relative state. The special authority state means that I might have a username and password combination, so I need to parse that. If it's a special relative or authority state, it can be a file path, and on Windows that means one thing, while on macOS and Unix systems it's something else. And according to that, it can become a relative state. So it's not as easy as writing a regex in user-land.

Let's dive into one of the states available in the URL parsing specification, the scheme start state. The specification says: if c (the current character) is an ASCII alpha, append c, lowercased, to the buffer, and set state to scheme state. That is pretty straightforward. Otherwise, if state override is not given, set state to no scheme state and decrease the pointer by one. Otherwise, it's a validation error: return failure. Looked at from this perspective it's pretty simple, right? But this is one of the roughly 20 states that are defined in this specification.

Let's look at a slightly more complex one, the no scheme state. What happens if I don't have http, https, and so on? If base is null, or base has an opaque path and c is not U+0023 (#), it's a validation error: return failure. So first of all we need to know what an opaque path is; it's all explained in the specification. Otherwise, if base has an opaque path and c is U+0023 (#), set the URL's scheme to the base's scheme, the URL's path to the base's path, the URL's query to the base's query, and the URL's fragment to the empty string. So if my path starts with a hash, that means this is a fragment, so the parser needs to go to the fragment state, which is one of the end states. And so on. This is what the URL parsing part of the specification is all about.
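To make that concrete, here is a minimal sketch of the scheme start state in JavaScript; the State object and field names are my own illustration, not the actual url-state-machine code:

    const State = { SCHEME_START: 0, SCHEME: 1, NO_SCHEME: 2 };

    function isASCIIAlpha(c) {
      return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    function schemeStartState(machine, c) {
      if (c !== undefined && isASCIIAlpha(c)) {
        machine.buffer += c.toLowerCase(); // append c, lowercased, to buffer
        machine.state = State.SCHEME;      // and continue in the scheme state
      } else if (!machine.stateOverride) {
        machine.state = State.NO_SCHEME;   // no scheme: reprocess this character
        machine.pointer -= 1;
      } else {
        return 'failure';                  // validation error, return failure
      }
    }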
You might ask why we are even talking about this. The main issue that actually led me to this implementation is that there was a significant performance issue with the WHATWG URL implementation in Node.js, an issue was opened about updating it, and I wanted to make it faster. So, can it be faster?

This is the issue: re-implement the WHATWG URL parser in WASM. The URL is there for you if you want to look. It was actually opened by one of the Technical Steering Committee members, James Snell. He basically said that it should be possible to realize a significant performance improvement by porting the implementation to WASM, which is WebAssembly. So if we implement the URL parser through WASM, then we might get a faster URL parser. This was pretty interesting to me, because now I knew what I needed to do: port the implementation to a language that's performant, like C or Rust, and then expose the API through WebAssembly so that Node.js applications can consume it.

But before we start, let's take a step back and understand what our options are. Is WebAssembly really our only option? The potential solutions, not only for writing URL parsers but for most applications of this kind, are: you can compile C, Rust, or any other language to WebAssembly and expose the API; or you can write a Node-API addon in C or Rust, actually running the native code and exposing it through that binding; or you can write it in pure JavaScript. We don't want to write our implementation in C or Rust and expose it through a native API, because that's essentially the existing implementation, which is what's happening right now. So let's start with WebAssembly then.

I chose Rust because it's proven that Rust is super performant, and, honestly, I wanted to learn Rust, so this was one of the emotional decisions in this decision-making process. WebAssembly and Rust work really well together, thanks to Firefox; there have been lots of contributions on that side, there are really great libraries, and the ecosystem is ready for mass adoption. On top of that, there was undici, which is a really good example; it's backed by Node.js, and by compiling llhttp to WebAssembly they had a 1329% performance boost. It was really good, and it set the expectations for WebAssembly.

But those were the expectations. The reality is decoding UTF-16 to UTF-8. WebAssembly communication relies on shared memory, so there are problems with serialization and deserialization of that data: shared memory requires decoding and encoding. Text encoding and decoding in Node.js relies on C++, which means that we're going to use TextEncoder and TextDecoder, and we end up back in C++. A quick recap: the goal was to improve the existing implementation by removing the C++ bridge, but text encoding and decoding in Node.js relied on C++.

So let's benchmark all of this. If you write it in Rust and expose it through WebAssembly, and this is a rough estimation, not a full-feature-set URL parser written in Rust, the Rust implementation is 28.24% slower for the URL constructor, and that's just parsing the URL and returning an object, that's it. On the other side, URLSearchParams set is 86% slower, because whenever you are calling set, get, and those kinds of methods, there's a C++-to-JavaScript communication; we are using WebAssembly, but there's always a bridge in between. And for get it's 96% slower. The main reason URLSearchParams is slower is that it's actually implemented in JavaScript in Node.js.
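To illustrate the bridge those numbers reflect, here is a sketch, with my own illustrative buffer size, of the encode/decode round trip that every call into a WASM parser would pay:

    const input = 'https://example.com/path?q=term';

    // JS -> WASM: JavaScript strings are UTF-16, so they must be encoded
    // into UTF-8 bytes inside the shared memory first.
    const encoder = new TextEncoder();
    const memory = new Uint8Array(1024); // stand-in for WebAssembly.Memory
    const { written } = encoder.encodeInto(input, memory);

    // ...the WASM parser would run over memory[0..written] here...

    // WASM -> JS: the resulting bytes must be decoded back into a string.
    const decoder = new TextDecoder();
    const output = decoder.decode(memory.subarray(0, written));

    // Both hops go through C++-backed encoders in Node.js, which is exactly
    // the bridge this project set out to remove.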
But I wanted to see whether it was possible to also improve the URLSearchParams implementation. So it's 86% slower: that's a fact, not a random number. And here we are again, starting this presentation from scratch: can we write a fast URL parser using JavaScript? Today I'm going to talk about the road to a fast URL parser for Node.js. My name is Yagiz Nizipli, and let's start from scratch again.

So why JavaScript? As you know, there were three main alternatives we considered: we could use C++ or Rust through WebAssembly, we could use Node-API with C, or we could just write it in JavaScript. The existing implementation is in C++; we tried WebAssembly and did some benchmarking, but it was slow; and there is no other solution that does not use a C++ bridge. So I'm kind of limited by the technology of my time, as Howard Stark said.

As a result, the url-state-machine npm package arose. It's available through GitHub under anonrig, which I will include at the end of this presentation. It is basically a fast, compliant URL state machine for Node.js that's written in JavaScript, though not as fast as I would like it to be. As a usage example: if you just pass the URL string, it returns an object containing the scheme, username, password, and so on, and you can retrieve that information (there's a sketch of this just below). Let's do some benchmarks on the JavaScript implementation. There's another library called whatwg-url which provides the same feature set, and it's around 576% slower; yes, url-state-machine is five to six times faster. But the actual URL implementation provided by Node.js is 85% faster than my solution. This means that the decoding and encoding side of it is slow, and JavaScript is kind of slow for these particular operations.
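Here is roughly what that usage looks like; I'm reconstructing the API shape from the talk's description and the project README, so the exact field names may differ in the released package:

    const URLStateMachine = require('url-state-machine');

    const state = new URLStateMachine('https://user:pass@example.com:8080/path');

    // The parsed record exposes the components the state machine produced.
    console.log(state.url.scheme);   // 'https'
    console.log(state.url.username); // 'user'
    console.log(state.url.password); // 'pass'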
So there are some optimizations that I did, and I'm going to talk about them; sketches of each one follow at the end of this section.

First, using a function versus a class. Up until Node.js 18, that is, up until V8 9.5/9.7, if you created a class and had private fields on it, it was actually 55% slower. Just converting the class to a function, with the function/prototype keywords, makes your code 55% faster. This was one of the first optimizations that I did.

The second thing: changing your conditionals between switch and if can also improve your code. If you have more than ten keys and you're doing an if/else chain, converting it to a switch is actually faster; if you have fewer than ten, then if is faster. Again, it depends on how your implementation is going, and you need to choose correctly and do some micro-benchmarks in order to understand. And if you're using a newer Node.js version these dynamics can change, because these are micro-benchmarks that depend on V8.

Another one is string versus number comparison. This is really important. How would you compare two different strings? First you would check whether their lengths are equal; if they are, then you would traverse the characters one by one and make sure each character is equal to its counterpart. This means accessing every character of both compared strings and making an equality check on each. That's the idiomatic implementation you would need. So just by changing my states from strings to numbers, for example AUTHORITY becomes 100 and SCHEME_START becomes 101, I got a really huge performance gain. This was one of the changes that I did, and it's really important.

One other thing is slice versus substring. They do almost the same job here, but slice turned out to be faster. As an example, instead of substring(pointer + 1, input.length), I just changed it to slice, and the calculation of the fragment became a lot faster.

On top of that, isNaN: checking whether something is not a number, compared with checking for undefined, is actually a really costly operation. This was one of my mistakes; well, not a mistake, it was mentioned directly in the URL parser specification, and I wanted to be 100% compliant with it. In order to make it faster, I'm now starting to diverge from the specification and introducing optimizations on top of it. This is what happens with all of those runtimes: if you want to make things faster, you need to find shortcuts, and those shortcuts don't always mean you are following the specification. There might then be some subtle issues that occur, and because you are not following the specification they will also be super hard to track. So instead of taking the code point and checking it, I just check whether the character is undefined; if it's not, I get the code point at character zero. That was also a really huge boost.

One of the other most important parts of optimization is memoization. If you're recalculating a value every time, doing the same comparison many times, then storing that value in memory and accessing it later in the function is, again, super fast. This is one of the optimizations that I did: the URL specification states that you need to check whether the current URL scheme is one of the special schemes. Previously I was checking, on every check, whether it was in an object of special schemes, and then assuming it was a special URL. But you only need to set this flag, special URL or not, when you are changing the URL scheme, and then it's actually a lot faster.
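Here are those sketches. First, the class-versus-function change; the parser shown is a stand-in of mine, not the real code:

    // Before: a class with private fields, which was slower on the V8
    // versions mentioned above.
    class URLParserClass {
      #buffer = '';
      append(c) { this.#buffer += c; }
    }

    // After: a constructor function with prototype methods, which benchmarked
    // roughly 55% faster for this project at the time.
    function URLParser() {
      this.buffer = '';
    }
    URLParser.prototype.append = function (c) {
      this.buffer += c;
    };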
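Next, the numeric states and the switch dispatch together; the constants are illustrative, with only the two values named in the talk taken from it:

    // Comparing 'scheme-start' === 'authority' checks lengths and then
    // characters one by one; comparing 101 === 100 is a single integer check.
    const State = {
      AUTHORITY: 100,    // was the string 'authority'
      SCHEME_START: 101, // was the string 'scheme-start'
      SCHEME: 102,
      NO_SCHEME: 103,
      // ...one numeric constant per remaining spec state
    };

    function step(machine, c) {
      // With more than ~10 cases, switch benchmarked faster than if/else.
      switch (machine.state) {
        case State.SCHEME_START:
          // ...handle the scheme start state...
          break;
        case State.SCHEME:
          // ...handle the scheme state...
          break;
        // ...a case per remaining state
      }
    }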
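Finally, slice, the undefined check, and the memoized special-scheme flag in one sketch; the machine object and names are my own illustration, while the scheme-to-port table is the one from the URL specification:

    // Special schemes and their default ports, as listed in the URL spec.
    const specialSchemes = { ftp: 21, file: null, http: 80, https: 443, ws: 80, wss: 443 };

    const machine = {
      input: 'https://example.com/#fragment',
      pointer: 20, // points at '#'
      scheme: '',
      isSpecialUrl: false,
      setScheme(scheme) {
        this.scheme = scheme;
        // Memoization: update the flag only when the scheme changes,
        // instead of looking it up on every check.
        this.isSpecialUrl = scheme in specialSchemes;
      },
    };

    // slice over substring: equivalent result here, but slice was faster.
    const fragment = machine.input.slice(machine.pointer + 1); // 'fragment'

    // undefined check over isNaN: detect end-of-input by indexing first,
    // and only then take the code point of character zero.
    const char = machine.input[machine.pointer];
    const codePoint = char === undefined ? undefined : char.codePointAt(0);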
Let's finish with some statistics. As a result of this work, almost 2,000 lines of code were written, more than 800 benchmarks were run, and the current implementation is 99.9% spec compliant. The remaining 0.1% is one test related to text encoders and encoding UTF-8 and UTF-16, and it can be brought to 100% really easily. 733 web platform tests were run successfully to make sure our implementation is correct, and even though I tested against all of those WPT tests, they only covered 96% of the code. This means the web platform tests don't have the necessary cases to cover 100% of the code that was written, and I think that should be handled on the web-platform-tests side.

In terms of performance, the JavaScript-based parser was 566.66% faster compared to the other JavaScript implementation, but it was still 95 to 96% slower than the native C++ implementation. So, basically, as a really great guy from Brooklyn said, with great power comes great responsibility; it's now on V8 to fix this performance problem. We are really dependent on V8 for performance, and I hope we will have some breakthroughs in the future.

A small footnote: this whole presentation, this whole project, was a product of my graduation thesis at Fordham University. I would like to thank William Lord from Fordham University, as well as James Snell and Matteo Collina from Node.js. And thank you for listening.
...

Yagiz Nizipli

Node.js Core Member, Senior Software Engineer



