Conf42 JavaScript 2021 - Online

The Art & Science of AB Test Development

Over 90% of your website visitors do not buy. Because of this, conversion rate optimization has become a core practice for almost every organization. But out-of-the-box solutions will inevitably fail to achieve significant results without the critical expertise of a front end developer. Learn how to optimize your user experience with AB testing from a developer point of view and how to champion your company’s experimentation program with smart practices, process, and coding strategy.

In this talk, I’ll chat about:

Intro
  • Introduction to AB Testing
  • Popular platforms used for testing
  • Team members and expertise needed for an experimentation program

Setup
  • Platform installation
  • Forming a hypothesis
  • Creating a test plan

Experiment Walkthrough
  • Using Optimizely as example
  • Targeting, Variations, Audiences, Metrics, Traffic allocation
  • Previewing experiments
  • Brief experiment code exploration

Dev Strategy
  • In house vs 3rd party development
  • Setting up experiments, launching, and what results look like
  • AB test coding tips and advice
  • How to code for SPA websites
  • Debugging


  • Bill Coloe talks about the art and science of A/B test development. He currently works as a front end optimization developer at Lovevery, which offers stage-based play kits and digital products for early learning.
  • A/B testing is a method of showing several different variations of a web page or application to visitors at random and comparing which variant converts better. To run experiments, you'll need some basic roles at your organization.
  • With client-side testing you will want to implement your main platform snippet as high up in the document as possible. Every A/B test has a duration, which is the length of time the test should run until statistical significance is reached. Having a test plan in place makes it easy for you as a developer to understand exactly what you need to code for the A/B test.
  • Optimizely has an easy way to create audiences with drop-downs. The metrics section is where you add goals, such as clicks or visits to pages. For larger experiments you really want a build tool so that files can be separated and compiled into a single build.
  • Software development has leaned heavily on architecture analogies. Experimentation, however, is the art of building sandcastles. The aim is to make quick, iterative experiments. Velocity is key.
  • A/B test development is a separate work stream from your regular site's development. Avoid concurrent tests on the same page. Small iterative changes are usually best. Having enough traffic for your test to reach statistical significance is important.


This transcript was autogenerated. To make changes, submit a PR.
Hello everyone. My name is Bill Coloe and my talk is The Art and Science of A/B Test Development. What I'll be talking about is a brief introduction of who I am, an introduction to the A/B testing process, A/B test developer strategies, some developer tips, what happens post launch and how to iterate, and some parting advice.

So currently I am a front end optimization developer at Lovevery, which is a company that offers stage-based play kits and digital products for early learning. Prior to this I worked at several different conversion rate optimization agencies as a developer. Some things that I like: I am a very amateur home cook, a big fan of electronic music, and getting into student filmmaking. I love video games and I'm also a huge hockey fan. So that's enough about me.

Let's talk about what A/B testing actually is. In its simplest form, it's a method of showing several different variations of a web page or application to visitors at random, and comparing which variant converts better. We also call these tests experiments. You may also know this process as split testing. It's the same thing. We're actually splitting traffic of a website between the different variations that we create. And in this example here, we have one version of a website, which is the control, or how the website exists in its natural state, versus a variation, one where we've changed that header section to a different design. And then through the use of an A/B testing platform, we will launch this experience, split the traffic, and then measure the difference between these variations.

So with experiments, you'll need some basic roles at your organization. The first would be your lead optimizer. This is someone who's actually the point person for the A/B testing efforts and is managing a testing roadmap. You'll of course need a UX designer to research the problems or opportunities to test into.
And this person is designing those new experiences for your developer, who's actually creating and deploying the code. And lastly, you'll need a digital analyst to interpret the results and build recommendations based on those test outcomes. So we see that optimization and experimentation is really a collective effort. And after sharing more about the A/B testing process, I'll of course be explaining the intricacies of the developer role here.

So what platform do we want to A/B test with? Well, there's a lot of them, and you can see here this long list, and honestly, at their core they all do the same thing. They all have their pros and cons. But for the purpose of this talk, we'll be looking at some screenshots of Optimizely, because in my opinion they probably have the best UI and are the most straightforward. So we'll be seeing screenshots from Optimizely in this case. But there are plenty of tools for A/B testing out there.

So with client-side testing you will want to implement your main platform snippet as high up in the document as possible and ensure it's synchronous. The reason for this is that you don't want your users to see a glimpse of original content: if they're bucketed into a different variation, they'll see that original content change to their variation content. And this is what's known as flicker, and it degrades the user experience. So it's very important that this script is implemented synchronously. And yes, this is render blocking, but we are changing the way that the page is rendering here with this script, so it is warranted. But because of this, we want to make sure that this script, and whatever code we're deploying for our variations in our A/B test, is as performant and as small as possible. Also, a mistake that beginners make here is that they deploy this snippet through a tag manager, which automatically makes it async.
I think really the only exception for doing that is if you are only A/B testing elements that are below the visual fold. But in general it's not recommended.

So for this talk I have a website that has Optimizely implemented so we can actually take a look at what's going on. If we inspect with devtools we can see in the head that we have our snippet right here. And then in the console, this gives us a global optimizely object we can play around with, and it shows us data on the experiments that we're running. But let's switch back over to the slides for a second.

So on your ecommerce website, your team may have discovered a problem or opportunity, validated with data, to test with A/B testing. And here we have an example website that's selling some apparel, and we have a hypothesis statement stated here: I believe if we add a mini cart to the header, we'll make it easier for users to check out. If I'm right, we'll see an increase in order conversion rate. So we want to create an A/B test where we're showing a mini cart on this page. Right now, it's just a static cart icon on the top right. But what happens if we add a little mini cart, so that when we hover over, we see an order summary? Will that perform better than this current control version?

So once you've come up with that hypothesis, you'll want to formalize things into a requirements document. This is an example here where we have "Experiment 1: Add mini cart" as the title of our A/B test, and we'll restate that hypothesis. We'll show the problem that you're addressing. We'll list the device that the A/B test should run on, where it should run, and what key metrics you're tracking. Also, every A/B test has a duration, which is the length of time the test should run until statistical significance is reached. And then there are the requirements, the visual requirements. So on the left is our control, the way the page exists in its natural state, versus our variation one.
And you can see that we're coding that little mini cart on the cart icon hover. We'll list some dev specs here for the developer so they know exactly what they're doing. There will be some user QA stories here so that your QA personnel can figure out exactly what needs to be checked on the new variation, and some key metrics to track. So having this test plan in place makes it really easy for you as a developer to understand exactly what you need to code for this A/B test.

And when it comes to the way that things execute, they generally work the same across the platforms. Essentially the A/B testing snippet will load. It will check if the user is on a URL that is targeted within the A/B test, and if so, it will then check if the user is in a certain audience that you've set up for it, so something like the browser or desktop device. If they're in that audience segment that you've set up within the test, they'll then move on to a check of whether they're in the proper traffic allocation for the experiment. Because not all experiments have to be set to 100% of traffic. Some experiments, maybe to mitigate some risk, can be set to only 10% of traffic. And within that 10%, there will be a 50/50 split between v0 and v1. So if you're in that test group of 10% of traffic allocation, the platform will determine which variation you should see. And then any experiment code that you've written, let's say for v1, will then execute for that user. So that's the order of events.

Now, when it comes to dev strategy, there's a number of ways to code these A/B tests, but it always depends on the requirements, so we'll take a look at those now. Coming up with the dev strategy is going to depend on whether you are an in house developer or a third party developer. If you're an in house developer, you'll have source code access, which will make your job a lot easier.
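As a brief aside on the order of events described above (URL targeting, then audience, then traffic allocation, then the variation split), here is a minimal sketch of that decision flow. It is not how Optimizely or any other platform actually implements bucketing. The function names, the experiment object shape, and the hash-based assignment are all assumptions made for illustration.

```javascript
// Hypothetical sketch of the decision flow; not any platform's real code.

// Deterministic string hash (FNV-1a style) mapped to the range [0, 1).
// Hashing the visitor ID keeps a visitor in the same bucket across page loads.
function hashToUnit(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h / 0x100000000;
}

// Returns a variation name, or null if the visitor is not in the experiment.
function decideVariation(experiment, visitor) {
  // 1. URL targeting: a simple substring match.
  if (!visitor.url.includes(experiment.urlSubstring)) return null;

  // 2. Audience conditions, for example "desktop only".
  if (!experiment.audience(visitor)) return null;

  // 3. Traffic allocation: only a fraction of eligible visitors enter the test.
  const allocationRoll = hashToUnit(visitor.id + ':' + experiment.id + ':alloc');
  if (allocationRoll >= experiment.trafficAllocation) return null;

  // 4. Variation split among the visitors who made it this far.
  const splitRoll = hashToUnit(visitor.id + ':' + experiment.id + ':split');
  let cumulative = 0;
  for (const variation of experiment.variations) {
    cumulative += variation.weight;
    if (splitRoll < cumulative) return variation.name;
  }
  // Fallback in case the weights do not sum to exactly 1.
  return experiment.variations[experiment.variations.length - 1].name;
}

// Example: 10% of desktop traffic, split 50/50 between v0 and v1.
const exampleExperiment = {
  id: 'exp1-mini-cart',
  urlSubstring: 'example.com',
  audience: (visitor) => visitor.device === 'desktop',
  trafficAllocation: 0.1,
  variations: [
    { name: 'v0', weight: 0.5 },
    { name: 'v1', weight: 0.5 },
  ],
};
const exampleVisitor = {
  id: 'visitor-123',
  url: 'https://www.example.com/products',
  device: 'desktop',
};
// Logs 'v0', 'v1', or null depending on where this visitor hashes.
console.log(decideVariation(exampleExperiment, exampleVisitor));
```

The point of the sketch is only the ordering: each check can exclude the visitor before any variation code runs, and the variation code executes only for visitors who pass all of them.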
In this case, it's possible to do hide/show tests where, let's say, you're creating the addition of a new component somewhere. You can build that new component into your code base in a hidden state. And then in your v1 in Optimizely, just put a CSS rule to show that new component. There is also total control over deployment, so you know exactly when to deploy your A/B test code. You can sync it up with your normal site deploys. And you also have the ability to do more redirect-type tests. So if you're testing a completely different redesign of, let's say, a product detail page, you can code that new product detail page on a completely different URL. And then within Optimizely you can just do a redirect to that new page.

If you're working as a third party developer, you usually don't have access to the source code. So all changes have to be made based on what you see with the site in front of you. A lot of times you're searching for global functionality that you can manipulate, and it's a little more risky because there's unknown context. You may not know if the client website is going to deploy a change that removes some dependencies that your A/B test code was using, and that will break your A/B test. So really there's a big difference between the two, and that's going to determine your strategy of how you actually code these A/B tests. In this talk, I'm going to focus a little bit more on the third party dev aspect of it, just because there's a lot more challenge to it. And I think it makes for a much more interesting talk than simply coding up an alternative version of a page and then doing a redirect test.

So moving on, now we'll take a quick look at the setup within Optimizely for the A/B test. If we go to our A/B test in Optimizely, we'll see that we have a spot for variations. We're just going to have our original and our variation one here. The targeting is where the A/B test should run.
So if we're referencing our test plan, this A/B test is going to run sitewide, because this little cart icon where we're adding the mini cart is available sitewide. So we would do a substring match on the URL. The audience, referencing our test plan again, will be desktop. Optimizely has a really easy way to create audiences with these drop-downs: you just drag and drop. You could use a number of different audience conditions, the platform, the location, there's a whole array of these things, but in our case we just need desktop.

We'll jump now to the metrics. This is where we add certain goals, such as clicks or visits to pages. Custom goals can also be created in this section. Then shared code: this is any code that should run before any variation code runs. Usually this is maybe some sort of bucketing code to Google Analytics or Adobe SiteCatalyst, whatever analytics platform the site is using. And then the traffic allocation: this is the portion of visitors meeting the audience conditions that are eligible for the experiment. We'll keep it at 100% with a 50/50 split. And then there's a number of other options. You can schedule the test to go live at a certain point. It also gives you some API names to make your code a little bit more dynamic.

But we'll jump now to the coding strategy, how to actually code this test. So we'll go into one of these variations, our v1, and it has a WYSIWYG editor, but we're never going to use that. We're going to write actual code here. There are just certain things you can't do with the WYSIWYG. If we go into this editor here, we see that there's some code, and next we're going to walk through some of that code. So technically you could write all of your code in a single file and then copy and paste it into the Optimizely editor. But you'll see that it becomes unwieldy the larger the experiment is.
But just as an example, it might look something like this, where you have an IIFE, so as not to pollute the global namespace. And then you can add your CSS styles by way of string concatenation, and then you can start adding your HTML, with functions to create that mini cart, a function to save to local storage so that the data persists across the different pages, and the event listeners. So as you can see, this isn't a huge experiment, but for the bigger ones you really want something like a build tool so that all of these files and concerns can be separated and compiled into a single build, which I'll show next.

So now we'll look at a build tool and a defined structure for our A/B test code, and you'll see that this is a much more preferred way of doing development. Essentially this is a webpack config file where we can use all of the latest and greatest bundling options: we can use Sass, we can import HTML, we can minify, we can transpile, we can use node packages, do all those fancy things, which is really nice. So if we look at our entry file, which is our v1.js, we see that we can import our CSS and attach it directly to the head this way, and import those functions. It's a lot cleaner, a lot nicer than that single file build. And if we pull up the terminal and run the webpack command, it'll run some processes, and when it's ready it will spit out a single build for us, our v1 bundle here. And now we can actually just copy and paste this directly into Optimizely.

So at this point we want to test our code in the console just before we paste it into Optimizely. And this is a single page app, so even if we go to another page, we're still within the same context. This code has watchers to check if any other items are being added. It's checking local storage and then creating that HTML list when we're hovering over the cart icon. And if we go to another page, add another product, and hover over, we see that populate in the mini cart, which is great.
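To make the single-file approach described above more concrete, here is a minimal sketch of what such a variation file might look like: an IIFE, CSS added by string concatenation, a function that builds the mini cart HTML, a function that saves to local storage, and the event listeners. It is not the code shown in the talk. The `.cart-icon` and `[data-add-to-cart]` selectors, the `exp1MiniCart` namespace, and the item shape are all hypothetical.

```javascript
// Hypothetical single-file variation; selectors, keys, and names are illustrative.
(function (root) {
  'use strict';

  var NS = 'exp1MiniCart'; // One prefix for classes, storage keys, and globals.
  var STORAGE_KEY = NS + ':items';

  // Read saved items; fall back to an empty list on missing or corrupt data.
  function loadItems(storage) {
    try {
      return JSON.parse(storage.getItem(STORAGE_KEY)) || [];
    } catch (e) {
      return [];
    }
  }

  // Append an item so the cart contents persist across pages.
  function saveItem(storage, item) {
    var items = loadItems(storage);
    items.push(item);
    storage.setItem(STORAGE_KEY, JSON.stringify(items));
    return items;
  }

  function escapeHtml(text) {
    return String(text).replace(/[&<>"']/g, function (ch) {
      return { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' }[ch];
    });
  }

  // Build the order summary markup shown on hover.
  function renderMiniCart(items) {
    var rows = items.map(function (item) {
      return '<li>' + escapeHtml(item.name) + ' - $' + Number(item.price).toFixed(2) + '</li>';
    });
    return '<ul class="' + NS + '__list">' + rows.join('') + '</ul>';
  }

  // Expose a single namespaced object so QA can inspect the helpers.
  root[NS] = { loadItems: loadItems, saveItem: saveItem, renderMiniCart: renderMiniCart };

  // Everything below needs a browser DOM.
  if (typeof root.document === 'undefined') return;
  var doc = root.document;

  // CSS by string concatenation, injected into the head.
  var css = '.' + NS + '{position:absolute;right:0;display:none;background:#fff;}' +
            '.' + NS + '--open{display:block;}';
  var style = doc.createElement('style');
  style.textContent = css;
  doc.head.appendChild(style);

  var icon = doc.querySelector('.cart-icon');
  if (!icon) return;
  var panel = doc.createElement('div');
  panel.className = NS;
  icon.appendChild(panel);

  icon.addEventListener('mouseenter', function () {
    panel.innerHTML = renderMiniCart(loadItems(root.localStorage));
    panel.classList.add(NS + '--open');
  });
  icon.addEventListener('mouseleave', function () {
    panel.classList.remove(NS + '--open');
  });

  // Delegated listener, so buttons re-rendered by the SPA are still caught.
  doc.addEventListener('click', function (event) {
    var button = event.target.closest('[data-add-to-cart]');
    if (!button) return;
    saveItem(root.localStorage, { name: button.dataset.name, price: button.dataset.price });
  });
})(typeof window !== 'undefined' ? window : globalThis);
```

Even at this size the file mixes styles, markup, storage, and event wiring, which is the unwieldiness that motivates splitting things up behind a build tool for larger experiments.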
Of course, if we refresh, we won't see that mini cart on hover anymore; we'd have to reenter our code into the console. So what we want to do now is paste this code into the Optimizely editor so that Optimizely can give us a preview link and we can have a persistent experience. So if we go back over to Optimizely, we'll go back to our variations, hit edit, go to the code editor, paste in our build, and hit save and apply. And now, to create a preview link, we go to API names and copy the ID of the variation. And we can use this in a special parameter: optimizely_x equals that ID, and then optimizely_token is going to equal PUBLIC. With this URL, you can essentially share the experience, with your code running, with any of the stakeholders, UX designers, or QA to make sure the experience is looking and functioning as expected. So if we add those query parameters and add a product to the cart, we see that the experience is showing. And if we go back to the home page with those params and hover over the cart, we still see the mini cart. So this way we can share this experience with anyone.

Now, once your experiment preview has been QA'd and approved, it's time to set the test live. After the test is live, the platform will continue to collect data and randomly bucket users into either v0, v1, v2, and so on. And those users will continue to see their experience until they clear their cookies. This whole time the experiment platform is collecting results. Now, this is typically more of an optimizer or analyst responsibility, but this is what it would look like after the test has run and reached statistical significance.

Now what if your test lost? Well, it's not all in vain. You've probably learned something, and you've probably saved time and money by not permanently implementing something that wasn't going to work. And oftentimes negative test results lead to new testing ideas that you can continue to iterate upon. You really do need to have that culture of experimentation at your company, and that trust and that ownership, to continue doing this until you find a winner.

Now if your test won, that's great. You can temporarily set, in Optimizely, the v1 experience to 100% of visitors. You only want to do this for a short amount of time. It's just generally not good to have what are supposed to be somewhat ephemeral tests running for such a long time. You want to immediately get this into your main core development team's roadmap so that it can be a permanent change.

So now I'd like to share a quote from a friend and former coworker of mine. His name is Aaron Montana, and I think this beautifully captures the essence of A/B test development. "Historically, software development has leaned heavily on architecture analogies. Strong foundations build good houses, maintenance is vital, etc. Experimentation, however, is the art of building sandcastles. Beautiful structures, but complete facades intended to delight, facilitate learning, and be washed away by the tide." It's so poetic, but it's so true. I mean, what we're doing here is making quick, iterative experiments. So velocity is key, because what we care about are those insights, so we can make decisions on things we want to productionize or things that just don't work. So thanks, Aaron, for this.

So now I'll share some general developer tips for A/B test development. The first is to avoid concurrent tests on the same page. Not only will this reduce complexity, but if you have many A/B tests running on the same page, it can also be hard to attribute results, because you're not sure if the results you're seeing are because of a combination of variations in experiments on the same page. So in general, to reduce complexity, just keep it to a few per page.
Also, namespace your classes, IDs, and global variables. A/B test development is a separate work stream from your regular site's development, so anything you can do to reduce any kind of conflict there is important. Also, always QA on actual devices. We know that actual devices can perform differently than the emulator, so that's important. Check logged-in and logged-out states when you're developing the code. Anything dynamic should be carefully considered and accounted for in your code. Confirm the control state when bugs are found. If there's any bug on your site, stakeholders may not know where it's coming from. The first thing you might want to do is opt out of your A/B testing platform and see if the bug still exists. If it does, it's not coming from your A/B test. That's usually a good first step.

Here are some mistakes that beginners make, like writing CSS inline, as we saw before. We have modern tooling now, like Sass, and we can compile that into a single build that we can add. And a lot of the platforms have a separate space for CSS, so we should take advantage of that. Anything to keep our code clean and organized.

Also, here are a few tools that I think are particularly important when it comes to single page applications, where you're seeing those virtual page changes and components are rerendering. Your code really has to run at those certain triggers, and mutation observers are great for that. You can attach your mutation observer to a certain section or element, watch the subtree, and then when there's a mutation you can run a callback. Intersection observers are great for when you want to detect if an element has entered your viewport. There is a technique called XHR override, where you're basically overriding the prototype of the XHR request and then putting in your custom callback when that request is completed. And we can also use polling functions to check for a custom condition and then run a custom callback.
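Two of the techniques just mentioned, polling for a custom condition and watching a subtree with a mutation observer, can be sketched as small helpers. This is a rough illustration, not code from the talk. The helper names `pollFor` and `watchSubtree`, the default interval and timeout, and the `.cart-icon` selector in the usage comment are assumptions. `MutationObserver` is a standard browser API, so `watchSubtree` only works in a browser.

```javascript
// Hypothetical helpers for SPA-friendly experiment code.

// Poll until condition() is truthy, then run callback once.
// Gives up quietly after timeoutMs so the timer never runs forever.
function pollFor(condition, callback, options) {
  var intervalMs = (options && options.intervalMs) || 50;
  var timeoutMs = (options && options.timeoutMs) || 5000;
  var startedAt = Date.now();

  var timer = setInterval(function () {
    if (condition()) {
      clearInterval(timer);
      callback();
    } else if (Date.now() - startedAt >= timeoutMs) {
      clearInterval(timer);
    }
  }, intervalMs);

  // Return a cancel function in case the caller wants to stop early.
  return function cancel() {
    clearInterval(timer);
  };
}

// Run callback whenever nodes are added or removed under target.
// Browser only: relies on the standard MutationObserver API.
function watchSubtree(target, callback) {
  var observer = new MutationObserver(function (mutations) {
    callback(mutations);
  });
  observer.observe(target, { childList: true, subtree: true });
  return observer; // Call observer.disconnect() when the experiment is done.
}

// Example browser usage (not executed here):
// pollFor(
//   function () { return document.querySelector('.cart-icon'); },
//   function () { watchSubtree(document.body, function () { /* re-apply changes */ }); }
// );
```

Polling is the blunter tool but works for any condition, including globals that appear late. The observer reacts only to DOM changes, which usually makes it the cheaper choice when a component rerenders.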
Here are some query parameters. These are specific to the Optimizely platform, but you can log what audience segments you're in and any warnings or errors. We have a query parameter to disable or opt out of Optimizely altogether. And as we saw before, we have a query parameter to preview experiences in Optimizely. They also have a great debugging extension. This is great for your non-developers to see what's going on as well.

So really, A/B testing is a communal activity, and everyone across your organization has great ideas and different perspectives and different sources of data. So I encourage you to have a way for those people to submit their ideas. One really simple way you could do this is a Google form. Ask them if they have any ideas, and then you can create a roadmap, prioritize them, and figure out what would work best for your organization.

Lastly, some parting advice. Definitely keep a hypothesis library, some centralized repository where you're keeping track of all the ideas that you want to test, along with the outcome, the development effort that goes into each, and the priority. Small iterative changes are usually best. If you're doing a test where you're testing a completely different product page design, maybe adding several features to the page, then even if that A/B test has a positive result, you may not know what specifically caused it. If you test small changes, then a positive or negative result can be directly attributed to that one change. Having enough traffic is important. You need a certain level of traffic for your test to reach statistical significance, so keep that in mind. And always be testing. This should be part of your company's culture. I think there's a lot you can learn from this, and it's fun too. And I think sharing the results of your A/B tests is also a great activity to do with your coworkers. So that's the end of my talk. Here are some places to find me.
Thanks very much for listening and I'll see you next time.

Bill Coloe

Front End Optimization Engineer @ Lovevery
