The problem with rating averages
Think of the typical eCommerce site, which often features products sorted by their ratings and reviews. Ranking by rating score seems straightforward; however, there are some hidden nuances in performing the calculations. For example, consider two products: A and B. Product A has just three ratings, all 5-star, while product B has one hundred ratings: ninety-seven 5-star and three 4-star. If the site simply takes the average of the ratings and ranks the two products, product A would rank higher (at an average of 5 stars) than product B (at an average of 4.97 stars). However, product B, with its 100 ratings, is more likely to possess true 5-star quality because it has been vetted more thoroughly, and should be ranked higher than product A.
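The trap is easy to reproduce. A minimal sketch of the naive average ranking (product names and rating counts taken from the example above):

```python
# Product A: 3 ratings, all 5-star. Product B: 97 five-star + 3 four-star.
ratings_a = [5] * 3
ratings_b = [5] * 97 + [4] * 3

avg_a = sum(ratings_a) / len(ratings_a)  # 5.0
avg_b = sum(ratings_b) / len(ratings_b)  # 4.97

# Sorting by the plain average puts the barely-reviewed product first.
ranked = sorted([("A", avg_a), ("B", avg_b)], key=lambda p: p[1], reverse=True)
```

Here `ranked[0]` is product A, even though B's 4.97 average rests on far more evidence.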
We have a very similar problem when ranking which ads are effective in a campaign. For example, most direct response advertisers would like to know which ads, in which target audiences, are driving the most conversions (where a conversion can be anything from a membership sign-up or app download to a product purchase). To compare effectiveness, performance marketers may calculate the conversion rate (number of conversions divided by number of clicks) for each ad. If ad A had 3 clicks and 1 conversion (a 1/3 conversion rate), is it better or worse than ad B with 100 clicks and 3 conversions (a 3/100 conversion rate)?
Where data confidence comes in
This is where a statistical framework can help. A confidence interval indicates the range of statistically plausible conversion rates for an ad. The range is tight when the ad has had many clicks and broad when it has only a few. Ad A’s very small number of clicks makes it an “unknown” variable: its true performance can’t be predicted accurately. Ad B’s 100 clicks, by contrast, are a much better indicator of true performance and allow for a much more accurate estimate. So when comparing two ads, comparing the confidence intervals of their conversion rates, rather than the single calculated point, gives a much clearer idea of how the two ads stack up.
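One common way to compute such an interval for a proportion is the Wilson score interval (the article doesn’t name a specific method, so this is an illustrative choice). A sketch using the two ads from the example:

```python
import math

def wilson_interval(conversions, clicks, z=1.96):
    """Approximate 95% Wilson score interval for a conversion rate."""
    if clicks == 0:
        return (0.0, 1.0)  # no data: anything is possible
    p = conversions / clicks
    denom = 1 + z**2 / clicks
    center = (p + z**2 / (2 * clicks)) / denom
    margin = (z / denom) * math.sqrt(
        p * (1 - p) / clicks + z**2 / (4 * clicks**2)
    )
    return (max(0.0, center - margin), min(1.0, center + margin))

# Ad A: 1 conversion in 3 clicks -- the interval is very wide.
low_a, high_a = wilson_interval(1, 3)
# Ad B: 3 conversions in 100 clicks -- the interval is much tighter.
low_b, high_b = wilson_interval(3, 100)
```

Ad A’s interval spans most of the plausible range, so its 1/3 point estimate tells us little; ad B’s interval is narrow enough to compare against other ads with some confidence.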
Rule of Three
Is there an easy way to figure out the confidence interval? A really useful rule of thumb is the “Rule of Three.” If no conversions have happened after some number of clicks, you can divide 3 by the number of clicks so far to estimate, with 95% confidence, the upper end of the confidence interval (i.e., the best-case conversion rate). For example, if 300 clicks have happened with no conversions, you can estimate with 95% confidence that the true conversion rate is between 0 and 3/300 = 0.01.
What does this mean for me?
- We use this “data confidence” approach in our Creative Tester tool, where we compare ads and pause down underperforming ones
- When you are manually harvesting “good” ads and pausing down “underperforming” ads, remind yourself of the rating score ranking problem and don’t fall into the same trap! Use the Rule of Three when you can, and see if the conversion rate range is consistent with your expectations.