How multivariate testing can take your website optimization to the next level

The exciting thing about conversion rate optimization (CRO) is that it helps you figure out what works on your website and what doesn’t. Through thorough testing, you can delve deep into the psychology of shoppers and create experiences that delight your customers and help your business succeed.

CRO can be a real eye-opener for marketers and designers alike, arming them with knowledge of which headline, image, button copy, layout and more give you the best chance of turning visitors into customers.

Because of the potential impact that CRO offers, it can be tempting to dive straight in and start throwing tests together. A/B testing can give you incredible insights and it’s only natural to want to test as many things as possible, as quickly as possible. After all, the more you test, the more you learn. But for many websites, starting off with A/B testing can sometimes feel frustrating due to the limitations of only being able to test one element on the page at a time. This is where, when done well, multivariate testing can really help you step up your optimization efforts.

What is multivariate testing?

Multivariate testing is a form of split testing, where you test multiple versions of a web page against each other, showing different customers different designs and measuring which one delivers the best results for your business. As the name suggests, multivariate testing involves testing multiple variations of a page at the same time, helping you identify the optimal combinations and how different elements work together on the page.

Multivariate testing vs A/B testing

Perhaps the easiest way to understand multivariate testing is to compare it to A/B testing, its close relative and the simpler, more common form of testing. Different elements on your web page can have an impact on user behavior. For example, the hero image at the top of your homepage is often the first thing that users see when they land on your site. You might want to find out which photo is the most engaging and is most likely to encourage people to click through to explore the products on your site. A simple A/B test would be to use two different photos and measure the impact, using A/B testing software to deliver two live versions of the page.

Visitors are put into two different ‘buckets’; half will see one picture (version A), the other half will see the other (version B). Your testing software or analytics platform can then measure which photo has the biggest impact on the key metrics you want to measure. As more and more visitors hit your site, you’ll start seeing that one version performs better than the other.

You should let your test run for long enough to achieve ‘statistical significance’ – this gives you a high degree of confidence that the difference in results is not just down to chance and that permanently implementing the winning variation would have a positive impact on your site. Statistical significance is worked out using formulas involving the number of visitors, the metrics you are tracking, and the measurable difference between each version.
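To make that concrete, here is a minimal sketch of the kind of calculation testing tools perform under the hood, using a standard two-proportion z-test in Python; the visitor and conversion numbers are made up for illustration:

```python
from math import sqrt
from statistics import NormalDist

# Illustrative numbers: visitors and conversions for each variation.
visitors_a, conversions_a = 10_000, 420   # version A: 4.2% conversion rate
visitors_b, conversions_b = 10_000, 486   # version B: ~4.9% conversion rate

p_a = conversions_a / visitors_a
p_b = conversions_b / visitors_b

# Pooled conversion rate under the null hypothesis (no real difference).
p_pool = (conversions_a + conversions_b) / (visitors_a + visitors_b)
se = sqrt(p_pool * (1 - p_pool) * (1 / visitors_a + 1 / visitors_b))

z = (p_b - p_a) / se
# Two-sided p-value: the probability of seeing a difference this big by chance.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"A: {p_a:.2%}  B: {p_b:.2%}  z = {z:.2f}  p = {p_value:.4f}")
# A p-value below 0.05 is a common (if arbitrary) significance threshold.
```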

There are different statistical theories about how to most accurately measure significance. We won’t go into the details of those here, other than to say that Sentient Ascend leverages Bayesian statistics in its intelligent learning model. This is because more traditional statistical models are impractical to apply to the vast number of possible variations that the platform offers. The Bayesian model, combined with AI-powered machine learning, allows Ascend to more accurately identify the most promising design iterations without waiting for traditional statistical significance, letting it move more quickly onto developing subsequent high-impact variations. It is this speed of learning that provides such a reliable base for determining winning website designs.
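To illustrate the Bayesian idea at its simplest (a generic sketch, not Ascend’s actual model), you can treat each variation’s conversion rate as a Beta distribution and estimate the probability that one variation beats the other by sampling:

```python
import random

def beta_sample(successes, failures, rng):
    # A Beta(a, b) sample via two Gamma draws (random.gammavariate is stdlib).
    a = rng.gammavariate(successes + 1, 1)  # +1 from a uniform Beta(1, 1) prior
    b = rng.gammavariate(failures + 1, 1)
    return a / (a + b)

def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=100_000, seed=42):
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        rate_a = beta_sample(conv_a, n_a - conv_a, rng)
        rate_b = beta_sample(conv_b, n_b - conv_b, rng)
        wins += rate_b > rate_a
    return wins / draws

# Same illustrative numbers as before.
print(f"P(B beats A) = {prob_b_beats_a(420, 10_000, 486, 10_000):.1%}")
```

Rather than waiting for a fixed significance threshold, a Bayesian approach can act as soon as this probability clears whatever level of confidence the business is comfortable with.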

So, to recap, A/B testing involves testing two different versions of an element on your page. If you have enough traffic, you can also extend this to run A/B/C or A/B/C/D tests, and so on. Using the hero image example, you would run 3, 4, 5 or more versions of the image all at the same time.

But with A/B testing, if there is more than one element on the page that you want to test, then you will need to run a new test after the first one has finished. If you were to test more than one element on the page at the same time, with your audiences crossing over between both tests, then you would not know which change was driving the performance improvements.

For example, say you want to know whether the ‘SHOP NOW’ call to action button which is laid over the photo can be improved and decide that you want to test a different color to see if that attracts more clicks.

If you were to run this test at the same time as the hero image test, the results would not be reliable, because you would have no way of knowing whether it was the image or the button that had an impact on user behavior. That means you need to wait until the image test concludes before you can run the button test. If you also want to test the headline or wording above the button, again you would need to wait until the button test is finished. Depending on your traffic levels and the detectable differences between the variations, it could take a month or more for each test to reach a conclusion. That means you’re looking at more than three months just to figure out the optimal version of your hero image, headline and call to action button. Having to take this rigidly methodical approach can be frustrating for optimizers and often proves a real drain on time and resources.

It also means you’re potentially missing out on even better combinations of elements, in this case the copy that goes best with each image. For example, say you change the image because the first test showed that a new image performed better. Then you test copy and find that the new copy works better, so you change that. But what you haven’t tested is the original image with the new copy – they may prove to be an even more powerful combination, because different copy can work with different styles of images. But rigid A/B testing schedules often deny you the chance to test out different combinations, as well as different individual elements.

When it comes to website optimization, time is money. Every month that goes by without you finding a better version of a web page, you are effectively leaving money on the table. So the quicker you can optimize your pages the better. Imagine if you could have tested all three of those elements at the same time. You would have identified a much better performing version of the page in much less time.

Discovering hidden winners with multivariate testing

There is also another limitation to A/B testing, beyond just the time it can take to optimize an entire page. Once you have declared a winner for one test and pushed the winning variation permanently live, the losing variation is now discarded and off the table. Nothing wrong with that, you might think. But that means you never get the chance to test how that variation performs when presented alongside other changes on the page.

For example, hero image B might have lost the test when it was combined with button color A and headline A. But what if hero image B, when used with button color B and headline B, would actually have out-performed the original test winner? Or hero image B, with button color A and headline B? And so on. Elements on a page don’t exist in isolation, so from a design perspective, it is also important to learn how different elements interact and impact each other, to discover which combinations work well together.

Unlike A/B testing, multivariate testing allows you to test these different combinations at the same time. Testing three different elements (image/button/headline) with two versions of each gives eight combinations (2 x 2 x 2 = 8), so you would be running eight different versions of your page all at once. In the image/button/headline example, the eight variations would look like this (a code sketch for generating them follows the list):

1) Hero Image A + Button Color A + Headline A
2) Hero Image A + Button Color A + Headline B
3) Hero Image A + Button Color B + Headline A
4) Hero Image A + Button Color B + Headline B
5) Hero Image B + Button Color A + Headline A
6) Hero Image B + Button Color A + Headline B
7) Hero Image B + Button Color B + Headline A
8) Hero Image B + Button Color B + Headline B
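If you want to enumerate these combinations programmatically, for example when configuring a test, the Cartesian product does it in a few lines (a Python sketch using the element names above):

```python
from itertools import product

# Two versions of each of the three elements under test.
hero_images = ["Hero Image A", "Hero Image B"]
button_colors = ["Button Color A", "Button Color B"]
headlines = ["Headline A", "Headline B"]

combinations = list(product(hero_images, button_colors, headlines))
print(len(combinations))  # 8, i.e. 2 x 2 x 2
for i, combo in enumerate(combinations, start=1):
    print(f"{i}) " + " + ".join(combo))
```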

Generally, your audience would be divided into eight buckets, with each bucket seeing a different version of the page. This allows you to find optimal designs more quickly, discover subtler insights into how elements interact with each other, and learn which elements on a page have the biggest impact.
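Bucket assignment is usually deterministic, so a returning visitor keeps seeing the same version of the page. A common technique (a general sketch, not any particular tool’s implementation) is to hash a stable visitor ID into a bucket number:

```python
import hashlib

NUM_BUCKETS = 8  # one bucket per combination

def assign_bucket(visitor_id: str, experiment: str = "hero-test") -> int:
    # Hash the visitor ID together with an experiment name so the same
    # visitor can land in different buckets across different experiments.
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode()).hexdigest()
    return int(digest, 16) % NUM_BUCKETS

print(assign_bucket("visitor-12345"))  # always the same bucket for this ID
```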

Winning designs that are one in a million

When you think about how many elements there are on a typical web page, just testing three barely scratches the surface. Even adding one extra version of each element quickly grows the number of potential combinations from eight to 27 (3 x 3 x 3 = 27). Start adding other elements into the testing pot too and it’s easy to see how you can then generate hundreds, thousands or even millions of combinations and versions of your pages. Check out Sentient’s example of how you can generate 1 million combinations from a standard web page layout, before you’ve even got below the fold.
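The arithmetic behind that growth is simple multiplication: the total number of page versions is the product of the variant counts of each element. A quick sketch with hypothetical element and variant counts shows how fast it compounds:

```python
from math import prod

# Hypothetical above-the-fold elements and how many variants of each.
variants_per_element = {
    "hero image": 6,
    "headline": 6,
    "sub-headline": 5,
    "button copy": 6,
    "button color": 5,
    "layout": 4,
    "nav style": 4,
    "promo banner": 6,
    "footer CTA": 2,
}

total = prod(variants_per_element.values())
print(f"{total:,} combinations")  # 1,036,800: just over a million
```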

It might sound unmanageable to set up thousands of combinations of a page, or even to think of that many options. But Sentient Ascend combines AI with your creative ideas and hypotheses for what to test, conceiving and building combinations much more quickly and allowing you to test all your ideas at once, actively finding the optimal combinations and intelligently identifying what works.

Advantages of multivariate testing

As we’ve seen, multivariate testing allows you to test many more changes all at once. This is a great time-saver, as it’s the equivalent of running numerous sequential A/B tests in a much shorter timeframe.

Even better, it also allows you to test combinations of elements that would otherwise have been discarded in a traditional A/B testing program. This lets you discover more subtle design changes that can positively impact your website’s performance.

With A/B testing, you are often relying on big wins to justify the test, because it is so much harder to find those wins when you’re testing one thing at a time. Multivariate testing, by contrast, can deliver multiple incremental improvements that quickly add up to long-term benefits. It also means you have a much better chance of hitting winners, because you are testing so much more. If you throw enough balls, you’re going to win a coconut. With A/B testing you’re throwing one ball at a time, hoping to hit. But you might just hit a small coconut, leaving the bigger ones still up for grabs. Multivariate testing is like throwing 100 balls at a time, so the odds of hitting are much higher. You’re also increasing your chances of hitting a bigger coconut, which in testing terms means you have increased your odds of landing those big winners.

Limitations of multivariate testing

The biggest barrier to most people running multivariate testing is traffic. That is because the more versions of a page you test, the more thinly you divide your traffic. Every time you add another variation, you increase the audience size needed to get reliable insights.

Bigger sample sizes also often mean tests need to run for longer so that you can attract enough traffic. The advantage, though, is that you can learn more from one multivariate test than from a single A/B test.
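To see why traffic becomes the binding constraint, here is a rough sample-size sketch using the statsmodels library; the baseline conversion rate and target uplift are illustrative assumptions:

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline = 0.042   # assumed 4.2% baseline conversion rate
uplift = 0.10      # we want to detect a 10% relative improvement
effect = abs(proportion_effectsize(baseline, baseline * (1 + uplift)))

# Visitors needed per bucket at 5% significance and 80% power.
per_bucket = NormalIndPower().solve_power(effect, alpha=0.05, power=0.8)

print(f"{per_bucket:,.0f} visitors per bucket")
print(f"A/B test (2 buckets):          {2 * per_bucket:,.0f} visitors")
print(f"Multivariate test (8 buckets): {8 * per_bucket:,.0f} visitors")
# (Ignores multiple-comparison corrections, which push the numbers higher.)
```

In practice, multivariate platforms can do better than this naive even split by modeling the contribution of each element across buckets, but the underlying pressure on traffic remains.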

Managing multiple combinations can also use up your testing resources, such as the staff needed to come up with design treatments, set up the tests and build the variations. Getting a developer to build one variation for an A/B test may seem reasonable – but needing them to build 27 variations for a multivariate test becomes a much more time-consuming prospect. Then there is the additional brainpower needed to analyze the results of comparing multiple variations.

Sentient Ascend helps to solve these issues and makes multivariate testing accessible to a wider range of businesses. It uses evolutionary computation to build the variations, without requiring human developer resources to do it. It also cuts down the time and traffic needed by using AI-powered algorithms to identify which combinations are promising and which can be quickly discarded.

Automatically generating combinations also does away with the need for extensive human developer resources, and it allows you to discover unexpected interactions between elements, including winning interactions that might never have been conceived by a human designer.