Evolution or Revolution? How Adaptive Evolutionary Optimization Fits into the World of Experimentation

As CRO experts, we are always looking for new ways to optimize our companies’ results through experimentation. A/B testing has been the traditional coin of the realm; only a very small percentage of companies with experimentation programs use multivariate testing, generally due to lack of traffic.

Along come artificial intelligence and new methods to consider for optimization, from the multi-armed bandit approaches that debuted a few years back to newer methods based on evolutionary algorithms, such as Sentient Ascend.

What do these new approaches mean for conversion professionals? When are they most valuable and when shouldn’t they be deployed? How does evolutionary optimization fit in with A/B testing and other popular means of experimentation? How can or should our processes change based on these new approaches?

On the eve of CXL Live, we thought these would be interesting discussion points to raise, to stimulate dialogue both at the conference and more broadly through the conversion community.

What is adaptive evolutionary optimization and how does it compare to A/B and multivariate testing methods?

Adaptive evolutionary optimization uses algorithms modeled on natural selection to test the impact of a large set (8 to 50) of individual changes, using an evolutionary process to efficiently search the space of all combinations of those changes (hundreds to millions of designs) towards a specific goal (e.g., a conversion rate increase).

To enable this approach to experimentation, the system automates the creation of individual candidate designs. The algorithms efficiently identify which changes are affecting KPIs positively and create new candidate designs that combine the highest-performing changes.
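To make that loop concrete, here is a minimal sketch in Python. This is not Sentient Ascend’s actual implementation: the page elements, variant counts, population size, and fitness function are all illustrative stand-ins, and in a live experiment fitness would come from measured conversion data rather than a formula.

```python
import random

# Illustrative only: each "design" picks one variant for each page element,
# e.g. element 0 = headline (3 variants), element 1 = CTA color (2 variants), ...
VARIANTS_PER_ELEMENT = [3, 2, 4, 2, 3]   # hypothetical test setup
POPULATION_SIZE = 20
GENERATIONS = 10

def random_design():
    return [random.randrange(n) for n in VARIANTS_PER_ELEMENT]

def crossover(parent_a, parent_b):
    # Child inherits each element's variant from one parent or the other.
    return [random.choice(pair) for pair in zip(parent_a, parent_b)]

def mutate(design, rate=0.1):
    # Occasionally switch an element to a different variant to keep exploring.
    return [random.randrange(n) if random.random() < rate else gene
            for gene, n in zip(design, VARIANTS_PER_ELEMENT)]

def observed_conversion_rate(design):
    # Placeholder: in a live experiment this would be the measured KPI
    # for the traffic that saw this candidate design.
    return sum(design) / sum(VARIANTS_PER_ELEMENT)  # stand-in fitness

population = [random_design() for _ in range(POPULATION_SIZE)]
for generation in range(GENERATIONS):
    ranked = sorted(population, key=observed_conversion_rate, reverse=True)
    parents = ranked[: POPULATION_SIZE // 2]          # keep the best performers
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POPULATION_SIZE - len(parents))]
    population = parents + children                   # next generation

best = max(population, key=observed_conversion_rate)
print("Best design found:", best)
```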

The AI uses a combination of Bayesian and traditional statistical techniques to predict the performance of the winning designs. It uses “weak” signals (compared to A/B testing’s 95% or 99% confidence thresholds) and aggregates them across multiple designs and generations to produce a strong signal by the end of the experiment.
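The article doesn’t spell out the exact statistics, but one common way to reason about this kind of weak evidence is a Beta-Binomial posterior over each design’s conversion rate. The sketch below uses made-up traffic numbers to show how modest per-design evidence can still yield a probability that one design beats another; an evolutionary system would aggregate evidence of this kind across many designs and generations.

```python
import random

# Hypothetical per-design observations: (conversions, visitors). Individually
# these are "weak" signals; the posterior quantifies the remaining uncertainty.
observations = {
    "control":   (120, 4000),
    "candidate": (140, 4000),
}

def sample_rate(conversions, visitors):
    # Draw one plausible conversion rate from a Beta(1, 1) prior updated
    # with the observed successes and failures.
    return random.betavariate(1 + conversions, 1 + visitors - conversions)

# Monte Carlo estimate of the chance the candidate truly converts better.
trials = 100_000
wins = sum(
    sample_rate(*observations["candidate"]) > sample_rate(*observations["control"])
    for _ in range(trials)
)
print(f"P(candidate beats control) is roughly {wins / trials:.2f}")
```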

As compared to A/B and multivariate testing techniques, adaptive evolutionary optimization offers a set of interesting capabilities:

Test Capacity: As mentioned above, this approach, depending on user traffic and conversion rates, can run experiments on 8 to 50 or more individual ideas at once, which together represent hundreds to millions of candidate designs.

Test Velocity/Efficiency of Traffic Use: Evolutionary algorithms can search towards the optimum through these “search spaces” of thousands or more potential designs while actually testing only a small fraction of them along the way, allowing you to test more ideas per unit of traffic than other methods (see the back-of-the-envelope sketch after this list for a sense of scale).

Win Rate: The “portfolio” approach of using evolution to test multiple ideas at once results in 80-90% win rates, compared to the 15-20% industry-standard win rates of A/B testing.

Average Lift per Experiment: As with any testing method, this is all over the map; it depends on where you start and on the quality of the ideas. Most of the winning designs from evolution incorporate multiple ideas for improvement, pushing average lifts higher than those of the average A/B test.

Adaptive to Your Audience: Because evolutionary optimizations evolve over time, they are more adaptive to changing audience conditions than other types of experiments, which are anchored to a static moment.

Automation: The automation of experimentation required to run these evolutionary approaches saves much of the resource time typically associated with chained A/B testing programs.
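To put some hypothetical numbers on the capacity and traffic-efficiency points above: even a modest experiment defines a combinatorial space far larger than what ever gets served to visitors. The element counts and generation sizes below are illustrative only.

```python
from math import prod

# Hypothetical experiment: 8 page elements, each with a few candidate variants.
variants_per_element = [3, 2, 4, 2, 3, 2, 3, 2]

possible_designs = prod(variants_per_element)   # every combination: 1,728 here
designs_served = 20 * 10                        # e.g. 20 designs per generation x 10 generations

print(f"Possible designs in the search space: {possible_designs:,}")
print(f"Designs actually shown to visitors:   {designs_served}")
```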

For more information on evolutionary algorithms in general, there’s an interesting blog article here.

Where Can Evolutionary Optimization be Valuable? When is it Better to A/B Test?

Even though we like to joke sometimes about the death of A/B testing, adaptive evolutionary optimization hasn’t killed A/B testing! It’s a new species of experimentation, and as such is specialized for certain environments and not others. How the multiple species of experimentation technologies evolve and co-exist over time will be interesting to watch, as each of us in the industry continues to hone our solutions.

When should evolutionary optimization be considered?

When Speed to Results is of the Essence: The efficiency of evolution allows us to test more ideas more quickly, which usually means faster and better results.

When Test Capacity is a Bottleneck: If you have more ideas than you can process through A/B or other methods, evolutionary optimization can open up your ability to test more ideas.

When Prioritization is a Problem: One benefit of evolutionary optimization’s increased test capacity is that it reduces the strain and tension of prioritizing which ideas to test. Instead of picking one idea, the team can select 20 or 30 to test, allowing multiple voices to participate and feel more invested in the experimentation process.

When Win Rate is a Problem: The more ideas you can test at once, the more likely you will find a winner.

When Resources are Tight and You Have a Big Testing Roadmap: The automation required to do evolutionary optimization allows testing organizations to scale up with the increased test capacity.

When is A/B testing a better option?

When You Want a Definitive Answer to a Single Independent Question: Many times you are interested in a specific feature that doesn’t commingle well with other changes, e.g., a new recommendation system or a new search box. For these kinds of tests, A/B testing is the indicated method.

When Your Traffic Doesn’t Support Evolutionary Optimization: Evolution requires roughly 3,000 conversions a month (e.g., a sale, an Add to Cart, a signup) to operate. For sites below this threshold, A/B testing is a better option: it lets you keep making data-driven decisions at traffic levels where evolutionary optimization isn’t viable.

When You Have Multiple Strong Goals: Currently, evolutionary optimization is focused on a single goal, so it is ideal where you have one predominant metric, like conversion rate or revenue per user. In some cases, though, you may have multiple equally weighted goals; in those cases, A/B testing might be a better approach.

Our Processes are Evolving as Well

The optimization community is well known for reflecting on its own methods, as you might expect. The rise of evolutionary optimization, along with other new techniques, is prompting a useful re-examination of experimentation processes. Some of the more interesting topics include:

  • How are our choices of what to test influenced by test capacity? For example, some A/B testing solutions recommend not focusing on any individual change with less than a 1% potential lift. Does evolutionary testing change how we think about ideation, e.g., the granularity of ideas, the number of variations per element tested, etc.?
  • How do prioritization processes change when you open up test bandwidth? Are there fewer problems, or just different ones?
  • What’s the balance and tradeoff, if any, between speed of experimentation (and achievement of results) and learning along the way?
  • How does personalization fit into the worlds of AI and A/B testing?
  • What are the ramifications of automation on experimentation team formation?

We love to discuss these topics, so we hope to engage with many of you through this blog and its comments. And if you’re at CXL Live, drop by!