How GeoLift tests work

The geography-based method Triple Whale uses to measure incrementality, and what makes a test valid.

Written by Kassandra Villa Arroyo

Tl;dr

A GeoLift test measures incrementality by running your advertising in one set of regions (test) and pausing or reducing it in a comparable set (control), then comparing revenue between them. After accounting for baseline differences between the regions, the remaining gap is the lift your advertising caused. Triple Whale uses the geo-based approach because it is less affected by audience spillover, works across all channels at once, and does not depend on platform holdout features.

Overview

This article explains how a GeoLift test works, so you understand what the test is doing before you read its results. It covers why Triple Whale tests by geography, how test and control regions work, how regions are selected, and how a test's required duration is determined. This article is written for marketers and analysts planning or reviewing an incrementality test.

GeoLift is also referred to as a geo holdout test.

Key terms

  • GeoLift: Triple Whale's geography-based incrementality test, found under Incrementality > New Lift Test > Test Design.

  • Test group: The regions where your advertising keeps running during the test.

  • Control group (holdout): Comparable regions where advertising is paused or reduced.

  • Synthetic control: A model of what the test regions would have done had the advertising been held out there too, built from the control regions' behavior.

  • Marketing contribution: The revenue your advertising actually caused, measured as the gap between actual results and the synthetic control.

  • Feasibility report: A pre-launch estimate of whether the test can detect an effect at your budget level.

How it works

A GeoLift test splits comparable regions into two groups. The test group keeps running the advertising you want to measure, and the control group holds it out by pausing or reducing spend.

Triple Whale then builds a synthetic control: a model of what the test regions would have done had the advertising been held out there too, based on the behavior of the control regions. The difference between what actually happened in the test regions and what the synthetic control predicted is the effect your advertising caused. This difference is your marketing contribution.
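To make the arithmetic concrete, here is a minimal sketch of the idea in Python. It is not Triple Whale's model: it stands in a one-variable least-squares fit for the synthetic control, and every name and number in it (pre_control, during_test, fit_line, and so on) is hypothetical.

```python
# All figures below are made up for illustration.
pre_control = [100.0, 110.0, 105.0, 120.0, 115.0]  # control regions, before the test
pre_test = [210.0, 230.0, 220.0, 250.0, 240.0]     # test regions, before the test
during_control = [90.0, 95.0, 92.0]                # control regions, ads held out
during_test = [230.0, 242.0, 236.0]                # test regions, ads still running


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Learn how test-region revenue tracked control-region revenue before the test.
intercept, slope = fit_line(pre_control, pre_test)

# Simplified synthetic control: the revenue the test regions would be expected
# to show if they had kept following the control regions' pattern.
synthetic = [intercept + slope * c for c in during_control]

# Marketing contribution: actual revenue minus the synthetic control, summed.
daily_gap = [actual - predicted for actual, predicted in zip(during_test, synthetic)]
marketing_contribution = sum(daily_gap)

print(f"Estimated marketing contribution: {marketing_contribution:.2f}")
```

With these made-up numbers the pre-test relationship is test = 10 + 2 * control, so the synthetic control predicts 190, 200, and 194 for the three test days, and the gaps of 40, 42, and 42 add up to a contribution of 124.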

How regions are selected. Regions are matched so the test and control groups behave similarly before the test begins. Good matching is what makes the comparison valid.
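This article does not spell out the matching procedure, so the following is only an illustration of what "behave similarly before the test begins" can mean in practice. It assumes similarity is judged by how closely each candidate region's pre-test revenue moves with the test region's, using Pearson correlation. The region names and figures are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equally long revenue series."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical pre-test daily revenue for one test region and three candidates.
test_region = [200.0, 220.0, 210.0, 240.0, 230.0]
candidates = {
    "region_a": [100.0, 111.0, 104.0, 121.0, 114.0],
    "region_b": [150.0, 140.0, 160.0, 145.0, 155.0],
    "region_c": [80.0, 85.0, 90.0, 88.0, 95.0],
}

# Rank candidates from most to least similar to the test region.
ranked = sorted(
    candidates,
    key=lambda name: pearson(test_region, candidates[name]),
    reverse=True,
)

for name in ranked:
    print(name, round(pearson(test_region, candidates[name]), 3))
```

A region that ranks high here moved in step with the test region before the test, which is what makes it a reasonable basis for comparison once the test starts. Scale does not matter for this check: region_a earns about half as much as the test region but still tracks it closely.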

How long a test needs to run. Before launch, Triple Whale produces a feasibility report that estimates whether the test can detect an effect at your budget level. You are told this before the test starts, not after.
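One way to think about a feasibility check is as a minimum detectable effect calculation. The sketch below is an assumption about the general technique, not a description of Triple Whale's feasibility report. It uses the textbook normal-approximation formula, treats day-to-day noise as independent, and uses hypothetical names and numbers throughout.

```python
from math import sqrt
from statistics import NormalDist, stdev


def minimum_detectable_effect(pre_test_gaps, test_days, alpha=0.05, power=0.8):
    """Smallest average daily lift a test of `test_days` days could detect.

    pre_test_gaps: daily gaps between actual and modelled revenue measured
    before the test, when the true lift is zero, so they reflect noise only.
    Assumes the daily noise is independent and roughly normal.
    """
    noise = stdev(pre_test_gaps)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * noise / sqrt(test_days)


# Hypothetical noise observed before the test, in revenue units per day.
pre_test_gaps = [5.0, -3.0, 2.0, -6.0, 4.0, -2.0, 1.0, -1.0]

# Hypothetical daily lift you expect the advertising to produce.
expected_daily_lift = 2.0

mde_by_days = {
    days: minimum_detectable_effect(pre_test_gaps, days) for days in (14, 28, 56)
}
feasible_days = [
    days for days, mde in mde_by_days.items() if mde <= expected_daily_lift
]

for days, mde in mde_by_days.items():
    print(f"{days} days: minimum detectable daily lift is about {mde:.2f}")
print("Durations that could detect the expected lift:", feasible_days)
```

Under these assumptions the detectable effect shrinks with the square root of the test length, so quadrupling the duration halves it. A test is worth launching only when the effect you expect is at least as large as the effect the design can detect.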

When to use / trade-offs

There are two main ways to run an incrementality test: by audience (show ads to some users, withhold from others) or by geography (run ads in some regions, hold out from others). Triple Whale uses the geo-based approach for most tests because geo holdouts have several advantages over audience-based holdouts:

  • They are less affected by audience spillover, where users in the holdout see ads through other means.

  • They work across all channels simultaneously.

  • They do not depend on platform-specific holdout features.

💡 On inconclusive results: A test that comes back "inconclusive" is not a failure. It means the sample size was not large enough to detect the effect. This is valuable information: it tells you the test design needs adjustment, not that the channel has zero impact.
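The article does not say how a result is labelled inconclusive. The sketch below assumes one common convention: a result is inconclusive when a confidence interval around the average daily lift still includes zero. The function names, the 95% level, and the data are all hypothetical, and the interval uses a normal approximation that is rough for very short tests.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def lift_interval(daily_gaps, confidence=0.95):
    """Normal-approximation confidence interval for the average daily lift."""
    n = len(daily_gaps)
    center = mean(daily_gaps)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(daily_gaps) / sqrt(n)
    return center - half_width, center + half_width


def verdict(daily_gaps):
    """Classify a test by whether its interval excludes zero."""
    low, high = lift_interval(daily_gaps)
    if low > 0:
        return "positive lift"
    if high < 0:
        return "negative lift"
    return "inconclusive"


# Hypothetical daily gaps between actual revenue and the synthetic control.
short_test = [12.0, -8.0, 15.0, -5.0]  # 4 days of data
long_test = short_test * 12            # same pattern observed over 48 days

print("Short test:", verdict(short_test), lift_interval(short_test))
print("Long test:", verdict(long_test), lift_interval(long_test))
```

Both series have the same average daily gap of 3.5. The short test cannot separate that gap from noise, while the longer one can, which is why an inconclusive result points to the test design and not to a channel with zero impact.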

Related questions

  • What is a synthetic control in a GeoLift test?

  • How are test and control regions selected?

  • What does an inconclusive result mean?

  • Why does Triple Whale test by geography instead of by audience?
