Statistical Significance – What Is It & How Can It Prevent Bad Optimization?

Dave Hamrick Product Listing Optimization, Split Testing Best Practices 1 Comment


Does the mere mention of statistics bring you back to your school days, when mathematics came back with a vengeance? Thankfully, Splitly scooped up a couple of the rare few who actually got kicks out of those brain-frying equations and algorithms. Little did anyone realize back then that statistical significance would become a fundamental principle of our Amazon optimization process.

But honestly, you don’t need to be freakishly smart to understand statistical significance or how optimization is performed. Splitly can still be used by those who have very little knowledge of how it actually works. Yet, having a basic understanding of why it works enables users to super-charge their optimization process and make changes with greater confidence.

So today we’re putting statistical significance into plain English and explaining why it’s so critical when optimizing our Amazon listings.

Quick Recap – How Do We Optimize Amazon Listings?

First of all, let’s make sure you’re up to speed with the most effective method for optimizing listings today.

We use split testing (also called A/B testing). Put simply, this means running two variants against each other and seeing which performs better.

By creating a variation of our Amazon listing and switching it in place of the original, we can measure changes in performance.

How Does Split Testing Work?

Split testing is a mathematical process that uses probability to tell us how much confidence we can place in a result.

The easiest way to understand this concept is to imagine the tossing of a fair coin with Heads and Tails.

The diagram above shows the probability of achieving results in this setup. You’ll notice that the 1st result has a 50/50 chance either way, but this then splits down to 25% chances for each branch.

Now let’s look at probability distribution in the graph above. As you can see, if the coin is flipped 100 times it has the highest probability of achieving heads 50 times (and therefore tails 50 times too). Then the probability of achieving different results reduces as you move away.

It’s worth noting that flipping the coin more times gives us more data, and the observed share of heads clusters more tightly around the true 50%.
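If you'd like to check the coin example yourself, here is a minimal sketch using only Python's standard library. The function names are made up for illustration. It computes the chance of each number of heads for a fair coin and shows how the results bunch up around 50% as the number of flips grows.

```python
from math import comb

def prob_heads(k: int, n: int, p: float = 0.5) -> float:
    """Probability of exactly k heads in n flips of a coin with P(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_share_between(n: int, low: float, high: float) -> float:
    """Probability that the share of heads in n fair flips lies in [low, high]."""
    return sum(prob_heads(k, n) for k in range(n + 1) if low <= k / n <= high)

# Find the single most likely number of heads in 100 fair flips.
most_likely = max(range(101), key=lambda k: prob_heads(k, 100))
print("most likely heads in 100 flips:", most_likely)

# More flips concentrate the share of heads between 40% and 60%.
for n in (10, 100, 1000):
    print(n, "flips:", round(prob_share_between(n, 0.4, 0.6), 4))
```

The same idea carries over to listings: the more sessions you collect, the less room random noise has to hide (or fake) a real difference.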

Still with us?

Now coming back to Amazon, imagine you are changing your listing back and forth between 2 variants. They “should” perform the same, unless one has a specific advantage. Any differences we can measure can indicate one is performing better than the other.

How Do We Measure Our Results?

When optimizing our listings using split testing, we must choose a single objective and use it to benchmark our results. On Amazon, we focus on the following two metrics:

  1. Increase Conversion Rate (CR) – If your listings have a higher conversion rate, you will make more sales from the same volume of traffic.
  2. Increase Click-Through Rate (CTR) – If you improve the likelihood of visitors clicking your listings, you will get more traffic (and sales as a result).

We focus primarily on optimizing for conversion, but optimizing your Amazon listing for traffic is also worth revisiting later.

How Can We Perform Split Testing On Amazon?

If you want to run split testing on your products manually, you’ll need to perform the following:

1. Switch Product Listing Variant Daily – Go to Seller Central and modify your product listing directly.

2. Collect Session Data Daily – Collect sessions and sales data daily, then export it into an Excel spreadsheet to calculate the conversion rate.

3. Analyze Your Data – Once you’ve collected sufficient data, you can begin to measure statistical significance to see if you can have confidence in your results. You can use this online tool.
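To make step 2 concrete, here is a rough sketch of the conversion rate calculation in Python rather than a spreadsheet. The rows and the column layout are hypothetical; your own export will look different. Conversion rate here is simply total orders divided by total sessions for each variant.

```python
from collections import defaultdict

# Hypothetical daily export: (variant live that day, sessions, orders).
daily_rows = [
    ("A", 120, 12),
    ("B", 110, 14),
    ("A", 130, 13),
    ("B", 125, 16),
]

def conversion_rates(rows):
    """Return {variant: total orders / total sessions} across all days."""
    sessions = defaultdict(int)
    orders = defaultdict(int)
    for variant, day_sessions, day_orders in rows:
        sessions[variant] += day_sessions
        orders[variant] += day_orders
    # Skip variants with no sessions to avoid dividing by zero.
    return {v: orders[v] / sessions[v] for v in sessions if sessions[v] > 0}

rates = conversion_rates(daily_rows)
for variant, rate in sorted(rates.items()):
    print(f"Variant {variant}: {rate:.1%}")
```

A gap between the two numbers is only a starting point. Whether you can trust it is what step 3 is for.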

Sounds like a lot of manual work and calculations right?

We think so too, especially when all of this can be achieved on autopilot using Splitly:

  • Sign up to Splitly and connect your Amazon account
  • Create listings variants using Splitly and schedule testing to run
  • Come back and check results at your leisure
  • Select a winner based on performance, set up another split test and continue!

Splitly switches between variants every day, collects all data from Amazon then plots the results ready for you to analyze.

Now let’s look closer at statistical significance and its major role in split testing and optimization.

Achieving Certainty in Results – Statistical Significance

As the name suggests, statistical significance is a calculation used to decide how significant your results are. In other words, it provides confidence that your conclusions are trustworthy and not just a fluke.

Let’s say you’re running a test where variant A and variant B have something different about them. You notice that variant A performs better than variant B. Statistical significance is how sure we are that variant A is better than variant B.

Splitly’s software calculates statistical significance using Welch’s t-test. If you’re feeling geeky, you can go and read up about the calculations and equations behind it.
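For the geeks, here is a minimal sketch of the textbook Welch's t-test in plain Python. It is not Splitly's actual code. The daily conversion rates are made-up numbers, the function names are ours, and reporting significance as 1 minus the two-sided p-value is an assumption about how a percentage like "90%" could be derived.

```python
from math import exp, lgamma, pi, sqrt
from statistics import mean, variance

def t_pdf(x: float, df: float) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t: float, df: float, steps: int = 2000) -> float:
    """Two-sided p-value P(|T| >= |t|), integrating the density with Simpson's rule."""
    x = abs(t)
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2 * (h / 3) * total  # P(-|t| <= T <= |t|)
    return max(0.0, 1.0 - central_mass)

def welch_t_test(a, b):
    """Return (t, df, p) for Welch's unequal-variance t-test on two samples."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t, df, two_sided_p(t, df)

# Hypothetical daily conversion rates for each variant.
variant_a = [0.10, 0.12, 0.11, 0.13, 0.12, 0.11, 0.12]
variant_b = [0.14, 0.15, 0.13, 0.16, 0.15, 0.14, 0.15]

t, df, p = welch_t_test(variant_a, variant_b)
# Assumption: significance expressed as 1 - p (two-sided).
print(f"t = {t:.2f}, df = {df:.1f}, significance = {1 - p:.2%}")
```

Welch's version of the t-test does not assume both variants have the same day-to-day variability, which suits listings whose sales swing differently from one variant to the next.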

How Much Certainty Do We Need?

Going back to the coin example, if we only flipped it 10 times and it gave heads 9 times, can we confirm that it’s an unfairly weighted coin? Technically not, because there is still a small chance (about 1%) that flipping a fair coin could have produced this same result.
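The coin arithmetic is easy to check directly. A short sketch, standard library only: with 10 flips there are 2^10 = 1,024 equally likely sequences, and we simply count the ones that contain 9 heads (or 9 or more).

```python
from math import comb

def prob_at_least(k: int, n: int) -> float:
    """Probability of k or more heads in n flips of a fair coin."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

exactly_nine = comb(10, 9) / 2**10   # 10 of the 1,024 sequences
nine_or_more = prob_at_least(9, 10)  # 11 of the 1,024 sequences

print(f"exactly 9 heads: {exactly_nine:.2%}")
print(f"9 or more heads: {nine_or_more:.2%}")
```

Either way, a fair coin does this roughly once in every hundred attempts, so 10 flips alone can't settle the question.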

That being said, no matter how many times we flip a coin or how many data points we collect, we’ll never reach 100% certainty. We just need to get close enough to make an educated decision.

By default, Splitly requires a minimum of 90% statistical significance before you can confidently conclude a test.

What Variables Can Affect Statistical Significance?

One of the biggest mistakes sellers make with split testing is ending their tests too early. This leads to bad decisions based on conclusions drawn too soon.

Higher statistical significance means more trustworthy results, but it comes at the cost of longer test durations. Two main factors come into play:

  1. Sample Data – More data will help you reach higher statistical significance faster. This is tied to the popularity of your product and how fast it’s selling (more sessions and conversions).
  2. The Difference – The size of the difference between the two variants also affects the results. A greater difference in performance between variants helps you reach higher statistical significance faster.

Here are a couple of examples to further illustrate this in action:

  • Scenario 1: Your product gets lots of traffic and makes good sales, plus you have a large difference between your test variants. You’ll be able to reach higher statistical significance faster.
  • Scenario 2: Your product doesn’t get much traffic or make many sales, plus the difference between your variants is small. You might struggle to ever reach a high level of statistical significance!
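To put rough numbers on these two scenarios, here is a sketch using a standard normal-approximation formula for comparing two conversion rates. This is a common planning rule of thumb, not how Splitly calculates anything. The function name, the example rates and the 80% power setting are all assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist

def sessions_needed(base_rate, variant_rate, confidence=0.90, power=0.80):
    """Approximate sessions per variant needed to detect the gap between two conversion rates."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided threshold
    z_power = NormalDist().inv_cdf(power)
    spread = base_rate * (1 - base_rate) + variant_rate * (1 - variant_rate)
    return ceil((z_alpha + z_power) ** 2 * spread / (base_rate - variant_rate) ** 2)

# A big difference needs far fewer sessions than a small one.
for variant_rate in (0.15, 0.12, 0.11):
    print(f"10% vs {variant_rate:.0%}:", sessions_needed(0.10, variant_rate), "sessions per variant")
```

Divide the result by your daily sessions to get a feel for test duration. A listing with 500 sessions a day gets through a test in a fraction of the time that a listing with 20 sessions a day would need.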

Best Practices For Reaching High Statistical Significance

The quicker you can successfully move through tests, the quicker you will optimize your listings. For this reason, we recommend the following best practices:

  • Start with your best performing listings – you’ll acquire more data quicker and you may unlock even more revenue from your best earners.
  • Start with the biggest changes then whittle down – you’ll be able to complete tests within a couple weeks and optimize faster, rather than waiting a month or longer between tests.
  • Aim for 90% as a minimum, but 95% is optimal – while it’s tempting to end your test early, hang on until at least 90% to prevent bad decisions.

The Most Common Mistake – Trying To Revive Losers

Many FBA sellers fall into the trap of trying to optimize their worst performing listings to “bring them to life”. They don’t even want to touch their best earners for fear of “breaking” them. But in actual fact, doing the opposite would be a smarter play.

Split testing doesn’t work well for struggling products. It cannot acquire data quickly and could take (literally) forever to reach statistical significance. It’s simply not a good use of time.

Further optimizing the best listings offers greater ROI and faster results. Remember, just because a listing is already doing well doesn’t mean there isn’t more juice that could be squeezed out of it.

That’s Statistical Significance Nailed!

If you’ve managed to keep up with us today, congratulations! You’re now more clued up on listing optimization than the vast majority of FBA sellers.

With your newfound understanding of split testing principles, you will fly through the process faster, make more confident decisions and avoid rash mistakes. So start running split tests today, squeeze more revenue out of your top earners and finally dominate in your niche!
