Recipe for implementing a homemade A/B testing engine

Killian Saint cricq · Published in Studocu Tech · Aug 3, 2021 · 7 min read

In our previous article, we talked about the lifecycle of A/B tests and went through an in-depth example of what a tracking plan should be. But what about the technical implementation? At StuDocu, we quickly realised that a custom, in-house implementation coupled with a solid tracking infrastructure would give us the latitude to experiment without limits.

In this article, you will find all the ingredients you need to develop a sustainable A/B testing engine, and we will go over the questions you should ask yourself prior to implementing it.

Configuration

First, you need a way to set configuration options for each of your A/B tests. You could use an entry per A/B test in a configuration file, a class, some sort of table structure in your database, or anything you can think of.

An A/B test configuration should consist of the following elements:

  1. A name. For instance, “Test Article Follow Button”, as showcased in the previous article of this series.
  2. A list of variant names. In our example, “Light” and “Green”.
  3. The probability for a user to be assigned each of the variants. In our example, 50/50. However, if your A/B test is risky, you could consider running it on an 80/20 split, for instance.
  4. Which of the variants is the control variant. In our example, “Light”.
  5. Optionally, a boolean to decide whether the variant assigned to a user should be persisted in the database. In our example, we would certainly want that, so the user has a consistent experience across visits on multiple devices — but I will expand on that a bit later.
  6. Optionally, a description explaining what it is about. In our example, the description could be something like “Test updating the follow button design”.

With such a configuration, you have everything you need to handle your A/B tests and compute which variant a user should see.
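
To make this more concrete, here is what such a configuration could look like as a Laravel-style PHP config file. This is only a minimal sketch: the file name, keys and values below are illustrative, not the actual StuDocu implementation.

    <?php

    // config/ab_tests.php - one entry per A/B test (illustrative keys).
    return [
        'test_article_follow_button' => [
            'name'        => 'Test Article Follow Button',
            'description' => 'Test updating the follow button design',
            // Variant name => share of users who should see it (must sum to 100).
            'variants'    => ['Light' => 50, 'Green' => 50],
            'control'     => 'Light',
            // Whether the assigned variant should be persisted in the database,
            // for a consistent experience across browsers and devices.
            'persist'     => true,
        ],
    ];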

Management system

Toggle

Before diving into the algorithm, two important notes:

  • A/B tests can introduce unexpected behaviours;
  • After a while, you might realise that the control variant performs better than the new variant(s).

In both cases, you do not want to wait for the A/B test to be terminated with a new release. This is why, at StuDocu, we decided to implement a management system where A/B tests can be enabled or disabled easily without the need to deploy a new version of the code.
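
Here is a minimal sketch of such a kill switch, assuming a hypothetical ab_tests database table with an enabled flag that an admin panel can flip at any time:

    use Illuminate\Support\Facades\DB;

    // Hypothetical kill switch: one row per A/B test in an "ab_tests" table,
    // with an "enabled" flag that can be toggled without deploying.
    function isAbTestEnabled(string $testName): bool
    {
        return (bool) DB::table('ab_tests')
            ->where('name', $testName)
            ->value('enabled');
    }

    // When a test is disabled, the engine simply returns the control variant.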

Going further

Ideally, new variants would outperform control variants in most of your A/B tests. In such cases, you might also not want to wait for a new release for the improvement to be rolled out to all of your users. To do so, your management system would need to have a feature to mark a variant as the winning variant.
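
Sticking with the same hypothetical ab_tests table, a nullable winning_variant column is enough to short-circuit the assignment once a winner has been declared:

    // Hypothetical nullable "winning_variant" column on the same ab_tests table.
    function getWinningVariant(string $testName): ?string
    {
        return DB::table('ab_tests')
            ->where('name', $testName)
            ->value('winning_variant');
    }

    // At the top of the variant resolver (see the flowcharts below):
    //   if (! isAbTestEnabled($testName))           return $config['control'];
    //   if ($winner = getWinningVariant($testName)) return $winner;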

Algorithm

With the configuration and the few notes above in mind, we can construct the decision tree the algorithm should go through to assign a user a variant of an A/B test.

Browser-specific consistent experience

If you do not care about your users having a consistent experience across browsers and devices, the flowchart is actually quite simple. In that case, the boolean deciding whether the assigned variant should be persisted in the database is not needed either.

A simple version of the A/B test algorithm flowchart

Implementing this flowchart will make sure that your users have a consistent experience across their multiple visits on the same browser, but not if they use another device, for instance.
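
Here is a minimal sketch of this browser-only flow, assuming the configuration format sketched earlier and a long-lived cookie to remember the assignment (the helper and cookie names are illustrative):

    use Illuminate\Support\Facades\Cookie;

    // Reuse the variant stored in a cookie if there is one; otherwise draw a
    // variant according to the configured weights and remember it.
    function resolveVariantForBrowser(string $testName, array $config): string
    {
        $cookieName = $testName . '_variant';

        $existing = request()->cookie($cookieName);
        if ($existing !== null && array_key_exists($existing, $config['variants'])) {
            return $existing;
        }

        $variant = pickWeightedVariant($config['variants']);
        Cookie::queue(cookie($cookieName, $variant, 60 * 24 * 365)); // ~1 year, in minutes

        return $variant;
    }

    // Weighted random pick, e.g. ['Light' => 50, 'Green' => 50] or ['A' => 80, 'B' => 20].
    function pickWeightedVariant(array $variants): string
    {
        $roll = random_int(1, array_sum($variants));

        foreach ($variants as $name => $weight) {
            $roll -= $weight;
            if ($roll <= 0) {
                return $name;
            }
        }

        return array_key_first($variants); // unreachable if the weights sum correctly
    }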

The dilemma of a fully consistent experience

If you want your users to have a consistent experience when they log in using another browser or device, there is still a tricky question to answer. When running an A/B test for both logged out users (referred to as guests from now on) and logged in users (shortened as users from now on), do you want users to have the same experience as before they logged in, or as when they used the platform on another browser or device?

Say, for instance, that one of your users (Jean) saw one of your A/B tested pages on his favorite browser. The next day, he uses the computer of his friend Eva, who happens to use the website as a guest and to have been assigned another variant. At first, Jean uses the website on Eva’s computer as a guest and experiences it with a different variant than the one he is used to. When he finally logs in, what should happen? Should he be switched back to the variant he was assigned on his favorite browser, or should the variant remain the one he saw as a guest on Eva’s computer? In the same vein, if Jean now uses his phone and is assigned yet another variant, which one should he see when he logs in there: that new variant, or his initial one?

If you want the experience to remain the same per browser, you can, once again, implement a fairly simple version of the flowchart. You only need to make sure to configure an A/B test to never use the database when it runs for both logged-in and logged-out users.

A middle-ground version of the A/B test algorithm flowchart
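
A sketch of that middle ground, building on the helpers above; findStoredVariant() and storeVariant() are placeholders for whatever persistence layer you use (a pivot table between users and A/B tests, for instance):

    // Logged-in users on a persisted test read and write their variant in the
    // database; guests, and tests configured not to persist, stay browser-based.
    function resolveVariant(string $testName, array $config, ?object $user): string
    {
        if ($user !== null && $config['persist']) {
            $stored = findStoredVariant($user, $testName);

            if ($stored !== null) {
                return $stored;
            }

            $variant = pickWeightedVariant($config['variants']);
            storeVariant($user, $testName, $variant);

            return $variant;
        }

        return resolveVariantForBrowser($testName, $config);
    }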

Guest vs user variants

If you want the experience to remain the same per user, or if you cannot decide and want to support both options, you need to make a distinction between the “guest variant” and the “user variant”, as shown in the following flowchart. These variants are stored under different keys in the session and cookies in order for multiple users to be able to use the same browser and still see their own variant. In the example showcased in the previous article, the keys could be, for instance:

  • test_article_follow_button_guest_variant for the guest variant,
  • test_article_follow_button_user_[user_id]_variant for the user variants.
A complete version of the A/B test algorithm flowchart

With this implementation, you are actually able to support a consistent experience with both the browser-based and the user-based strategies. When you want an A/B test to use the browser-based strategy, simply set its persistence boolean to false; when you want it to use the user-based strategy, set it to true.
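
A short sketch of that key split, reusing the naming from the previous article (the mirroring logic of the full flowchart is only summarised in comments):

    // Separate keys so that several accounts sharing one browser each keep
    // their own assignment alongside the guest assignment.
    function guestVariantKey(string $testName): string
    {
        return $testName . '_guest_variant';
    }

    function userVariantKey(string $testName, int $userId): string
    {
        return $testName . '_user_' . $userId . '_variant';
    }

    // Browser-based strategy ('persist' => false): read/write the guest key only.
    // User-based strategy ('persist' => true): for logged-in users, read the
    // database record (creating it if needed) and cache it under the user key;
    // guests keep using the guest key, as in the browser-only flow.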

Going further

You can go further by implementing:

  • a variant safety check, to make sure that the retrieved variant does exist — you can find such a flow, alongside high-resolution versions of the above flowcharts, on this Miro board;
  • a new path to support forcing a variant, for internal and testing purposes;
  • a new path to support setting the winning variant, as explained earlier.
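
Here is a brief sketch of two of those extensions; the force_variant query parameter and the helper names are illustrative, and any forcing mechanism should of course be restricted to internal users:

    // Safety check: make sure a retrieved variant still exists in the configuration.
    function isKnownVariant(?string $variant, array $config): bool
    {
        return $variant !== null && array_key_exists($variant, $config['variants']);
    }

    // Forcing a variant for internal and testing purposes, e.g. ?force_variant=Green
    function forcedVariant(array $config): ?string
    {
        $forced = request()->query('force_variant');

        return is_string($forced) && isKnownVariant($forced, $config) ? $forced : null;
    }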

Usage

After implementing the decision tree of your choice, the final method is ready to use. Depending on the strategy, it should have one or two inputs: the A/B test name, and optionally the currently logged-in user. Once the A/B test is handled and the variant computed and returned, you can do anything you want depending on its value.

The power of a custom, in-house A/B testing engine is there: you can conditionally run any specific logic depending on the variant — not only presentational, but also algorithmic.
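
For instance, here is a usage sketch based on the helpers above, assuming the Laravel-style config file and the resolveVariant() signature sketched earlier:

    // A/B test: Test Article Follow Button
    $config  = config('ab_tests.test_article_follow_button');
    $variant = resolveVariant('test_article_follow_button', $config, auth()->user());

    if ($variant === 'Green') {
        // New variant: render the green follow button / run the alternative logic.
    } else {
        // Control ("Light"): keep the default, unconditional path.
    }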

Here are a few guidelines to keep in mind when implementing a new A/B test:

  • To avoid unnecessary computation time, only handle the A/B test when necessary. If your A/B test runs on a specific page under specific conditions, handle it if and only if those criteria are met. This will also allow you to track the A/B test only when your users are actually impacted by it.
  • Only implement custom logic behind a check that the current variant is not the control variant (or that it is one of the new variants). The control variant should always be the default, unconditional path.
  • Add a comment with the name of the test on the pieces of code that will need cleaning up when the test is removed. It will save future developers time and reduce the risk of keeping dead code around.

Conclusion

A custom, in-house A/B testing engine allows your team to implement the widest range of A/B tests. For instance, at StuDocu, we ran an A/B test to compare two recommendation algorithms at the same time. This would be trickier without our homemade implementation, but also without our robust tracking plan, which we explained in the previous article, and our product analytics tool. We are able to run tailored A/B tests both in our back-end and our front-end, and easily monitor their performance at any time. You can learn more about the A/B tests we run in our Mixpanel case study, as well as in this webinar about marketing and product analytics.

We are planning to release a ready-to-use A/B testing package for Laravel in the future. If you cannot wait or are not using Laravel, I hope this article gave you enough insights to start implementing the algorithm yourself!
