I remember running my first A/B test after college. It wasn’t till then that I understood the basics of getting a big enough A/B test sample size or running the test long enough to get statistically significant results.

But figuring out what “big enough” and “long enough” were was not easy.

Googling for answers didn’t help me, as I got information that only applied to the ideal, theoretical, and non-marketing world.

Turns out I wasn’t alone, because asking how to determine A/B testing sample size and time frame is a common question from our customers.

So, I figured I’d do the research to help answer this question for all of us. In this post, I’ll share what I’ve learned to help you confidently determine the right sample size and time frame for your next A/B test.


**A/B Test Sample Size Formula**

When I first saw the A/B test sample size formula, I was like, whoa! Here's what each of its variables means:

- n is the sample size per variation
- 𝑝1 is the baseline conversion rate
- 𝑝2 is the conversion rate after the lift, i.e., 𝑝1 plus the absolute minimum detectable effect
- 𝑍𝛼/2 is the z-score from the z-table that corresponds to 𝛼/2 (e.g., 1.96 for a 95% confidence level)
- 𝑍𝛽 is the z-score from the z-table that corresponds to the desired statistical power (e.g., 0.84 for 80% power)

Pretty complicated formula, right?

**Luckily, there are tools that let us plug in as little as three numbers to get our results, and I will cover them in this guide.**
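If you're curious what those tools are doing under the hood, here's a minimal sketch in Python of one common form of the two-proportion sample size formula. (Different calculators use slightly different variants, so treat this as an illustration rather than a reproduction of any specific tool; the 60% baseline and 5-point MDE are example inputs I made up.)

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_variation(bcr, mde_abs, confidence=0.95, power=0.80):
    """Approximate sample size needed per variation for a two-proportion test.

    bcr     -- baseline conversion rate (p1), e.g., 0.60
    mde_abs -- absolute minimum detectable effect, e.g., 0.05 (5 points)
    """
    p1 = bcr
    p2 = bcr + mde_abs  # conversion rate lifted by the absolute MDE
    alpha = 1 - confidence

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # e.g., 1.96 for 95% confidence
    z_beta = NormalDist().inv_cdf(power)           # e.g., 0.84 for 80% power

    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return ceil(n)

# Example: 60% baseline, 5-point absolute MDE, 95% confidence, 80% power
print(sample_size_per_variation(0.60, 0.05))  # → 1468 contacts per variation
```

Raising the confidence level or shrinking the MDE both inflate the sample size, which is exactly the trade-off the rest of this guide works around.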

*Need to review A/B testing key principles first? This video helps.*

**A/B Testing Sample Size & Time Frame**

In theory, to conduct a perfect A/B test and determine a winner between Variation A and Variation B, you need to wait until you have enough results to see if there is a statistically significant difference between the two.

Experience bears this out: calling a winner before you reach statistical significance often means acting on a difference that's really just noise.

Depending on your company, sample size, and how you execute the A/B test, getting statistically significant results could happen in hours or days or weeks — and you have to stick it out until you get those results.

For many A/B tests, waiting is no problem. Testing headline copy on a landing page? It's cool to wait a month for results. Same goes for blog CTA creative — you'd be going for the long-term lead generation play, anyway.

But certain aspects of marketing demand shorter timelines with A/B testing. Take email as an example. With email, waiting for an A/B test to conclude can be a problem for several practical reasons I’ve identified below.

**1. Each email send has a finite audience.**

Unlike a landing page (where you can continue to gather new audience members over time), once you run an email A/B test, that's it — you can't "add" more people to that A/B test.

So you’ve got to figure out how to squeeze the most juice out of your emails.

This will usually require you to send an A/B test to the smallest portion of your list needed to get statistically significant results, pick a winner, and send the winning variation to the rest of the list.

**2. Running an email marketing program means you’re juggling at least a few email sends per week. (In reality, probably way more than that.)**

If you spend too much time collecting results, you could miss out on sending your next email — which could have worse effects than if you sent a non-statistically significant winner email on to one segment of your database.

**3. Email sends need to be timely.**

Your marketing emails are optimized to deliver at a certain time of day. They might be supporting the timing of a new campaign launch and/or landing in your recipients' inboxes at a time they'd love to receive them.

So if you wait for your email to be fully statistically significant, you might miss out on being timely and relevant — which could defeat the purpose of sending the emails in the first place.

That’s why email A/B testing programs have a “timing” setting built in: At the end of that time frame, if neither result is statistically significant, one variation (which you choose ahead of time) will be sent to the rest of your list.

That way, you can still run A/B tests in email, but you can also work around your email marketing scheduling demands and ensure people are always getting timely content.

So, to run email A/B tests while optimizing your sends for the best results, consider both your A/B test sample size *and* timing.

Next up — how to figure out your sample size and timing using data.

**How to Determine Sample Size for an A/B Test**

Let me show you how to calculate it.

**1. Check if your contact list is large enough to conduct an A/B test.**

To A/B test a sample of your list, you need a list size of at least 1,000 contacts.

If your list is smaller than that, you can still test; your results might not be statistically significant at the end of it all, but at least you're gathering learnings while you grow your email list.

**Pro tip:** If you use HubSpot, you'll find that 1,000 contacts is your benchmark for running A/B tests on samples of email sends. If you have fewer than 1,000 contacts in your selected list, Version A of your test will automatically go to half of your list and Version B goes to the other half.

**2. Use a sample size calculator.**

HubSpot’s A/B Testing Kit has a fantastic and free A/B testing sample size calculator.

During my research, I also found two web-based A/B testing calculators that work well. The first is Optimizely's A/B test sample size calculator. The second is Evan Miller's.

For our illustration, though, I'll use the HubSpot calculator. Here's how it looks when I download it:

**3. Input your baseline conversion rate, minimum detectable effect, and statistical significance into the calculator.**

This is a lot of statistical jargon, but don’t worry, I’ll explain them in layman’s terms.

**Statistical significance:** This tells you how confident you can be that the difference you observe is real rather than random chance. The lower the percentage, the less sure you can be about the results. The higher the percentage, the more people you'll need in your sample, too.

**Baseline conversion rate (BCR):** BCR is the conversion rate of the control version. For example, if I email 10,000 contacts and 6,000 opened the email, the conversion rate (BCR) of the email opens is 60%.

**Minimum detectable effect (MDE):** MDE is the smallest change in conversion rate that I want the experiment to detect between version A (the original, or control sample) and version B (the new variant).

For example, if my BCR is 60%, I could set my absolute MDE at 5 percentage points. This means I want the experiment to detect whether the conversion rate of my new variant differs from the control by at least 5 points.

If the conversion rate of my new variant is, for example, 65% or higher, or 55% or lower, I can be confident that this new variant has a real impact.

But if the difference is smaller than 5 points (for example, 58% or 62%), then the test might not be statistically significant, as the change could be due to random chance rather than the variant itself.

MDE has real implications for your sample size, and therefore for the time and traffic your test requires. The larger the MDE, the smaller the sample you need, because big differences are easier to spot than subtle ones.

The translation: a higher MDE gets you results with less time and traffic. The downside is that it can only detect large changes; smaller, real improvements will slip through undetected.

It's a trade-off you'll have to make. For our purposes, it's not worth getting too caught up in MDE. **When you're just getting started with A/B tests, I'd recommend choosing a smaller MDE (e.g., around 5 percentage points).**
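To see the trade-off in actual numbers, here's a quick sketch using one common form of the two-proportion sample size formula (the 60% baseline is an example number, and your calculator's exact formula may differ slightly):

```python
from math import ceil
from statistics import NormalDist

def sample_size(bcr, mde_abs, confidence=0.95, power=0.80):
    """Sample size per variation for a two-proportion test (one common form)."""
    p1, p2 = bcr, bcr + mde_abs
    z_a = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_b = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_a + z_b) ** 2 * variance / (p1 - p2) ** 2)

# Smaller MDE -> much larger sample needed per variation (60% baseline)
for mde in (0.02, 0.05, 0.10):
    print(f"MDE of {mde:.0%} needs {sample_size(0.60, mde)} contacts per variation")
```

Halving the MDE roughly quadruples the required sample, which is why chasing tiny improvements gets expensive fast.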

Note for HubSpot customers: The HubSpot Email A/B tool automatically uses the 85% confidence level to determine a winner.

#### Email A/B Test Example

Let’s say I want to run an email A/B test. First, I need to determine the size of each sample of the test.

Here's what I'd put in the Optimizely A/B testing sample size calculator:

Ta-da! The calculator has shown me my sample size.

**In this example, it is 2,700 contacts per variation.**

This is the size that *one* of my variations needs to be. So for my email send, if I have one control and one variation, I'll need to double this number. If I had a control and two variations, I'd triple it.

Here’s how this looks in the HubSpot A/B testing kit.

**4. Depending on your email program, you may need to calculate the sample size's percentage of your whole list.**

HubSpot customers, I'm looking at you for this section. When you're running an email A/B test, you'll need to select the percentage of your list to send each sample to, not just the raw sample size.

To do that, you need to divide the number in your sample by the total number of contacts in your list. Here’s what that math looks like, using the example numbers above:

**2,700 / 10,000 = 27%**

This means that each sample (both my control AND my variation) needs to be sent to 27-28% of my audience, so the test as a whole reaches roughly 55% of my list. And once a winner is determined, the winning version goes to the rest of my list.
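Here's that arithmetic as a quick sketch, using the 2,700-per-variation sample and 10,000-contact list from the example above:

```python
from math import ceil

sample_per_variation = 2_700   # per-variation sample from the calculator
list_size = 10_000             # total contacts in the list
num_variations = 2             # one control + one variation

# Percentage of the list each sample needs (rounded up)
pct_per_sample = ceil(sample_per_variation / list_size * 100)
pct_in_test = pct_per_sample * num_variations

print(f"Send each sample to {pct_per_sample}% of the list")  # 27%
print(f"The test covers {pct_in_test}% of the list in total")
```

Whatever is left over after the test (here, the remaining ~46%) is the audience that receives the winning version.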

And that’s it! Now you are ready to select your sending time.

**How to Choose the Right Timeframe for Your A/B Test for a Landing Page**

Calculating the time I need is easy: I take the total sample size the test requires (all variations combined) and divide it by the page's average daily visitors. The result is the number of days the test has to run.

This implies I should start running this test within the first two weeks of September.
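As a sketch of that math (the sample size and traffic figures below are hypothetical, not taken from the example above):

```python
from math import ceil

sample_per_variation = 2_700   # hypothetical: from a sample size calculator
num_variations = 2             # one control + one variant
daily_visitors = 450           # hypothetical average daily traffic to the page

# Total sample the test needs, divided by daily traffic, gives days to run
total_sample = sample_per_variation * num_variations
days_needed = ceil(total_sample / daily_visitors)
print(f"Run the test for about {days_needed} days")
```

If the resulting duration is longer than your campaign calendar allows, you'd either accept a larger MDE or find a higher-traffic page to test on.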

**Choosing the Right Timeframe for Your A/B Test for Email**

For email, the right time frame depends on how quickly engagement rolls in after a send. For example, what percentage of total clicks did you get on your first day? If most of your clicks arrive within the first day, extending the test much beyond that adds little.

If you have a large sample size and found a statistically significant winner at the end of the testing time frame, many email marketing tools will automatically and immediately send the winning variation.

If you have a large enough sample size and there’s no statistically significant winner at the end of the testing time frame, email marketing tools might also allow you to send a variation of your choice automatically.

If you have a smaller sample size or are running a 50/50 A/B test, when to send the next email based on the initial email’s results is entirely up to you.

If you have time restrictions on when to send the winning email to the rest of the list, figure out how late you can send the winner without it being untimely or affecting other email sends.

For example, if you've sent emails out at 3 PM EST for a flash sale that ends at midnight EST, you wouldn't want to determine an A/B test winner at 11 PM. Instead, you'd want to send closer to 6 or 7 PM, which gives the people *not* involved in the A/B test enough time to act on your email.