TL;DR

When your incrementality test finishes, Triple Whale shows how much revenue your advertising actually caused (marketing contribution), the percentage lift, and iROAS, alongside how confident the model is that the effect is real. Read four things together: marketing contribution, revenue lift, iROAS, and the probability of direction with its credible interval. A strong result has a meaningful positive contribution, a high probability of direction, and a credible interval that stays positive.

Overview

Incrementality tests answer one question: how many conversions or how much revenue did this channel actually cause, beyond what would have happened anyway? This article helps you read the results once a test finishes, so you can tell a trustworthy result from a noisy one and decide what to do next. It is written for marketers and analysts reviewing a completed GeoLift or Conversion Lift test. Results become available in the dashboard the day after the test ends.

Where to find your results

Open the test from the tests list. Results are organized into three tabs:

Results overview: your headline numbers, the attribution model comparison, and breakdowns by customer type and platform.
Deep dive: the revenue with versus without marketing chart, lift by period, and the marketing value distribution.
Configuration: the test setup and spend test tracker.

Key terms

Marketing contribution: the dollar value of revenue your advertising caused during the test, above the baseline. This is the headline result.
Revenue lift: the percentage increase in test-region revenue attributable to advertising, measured against the baseline estimated from control regions. A 15% lift means revenue in the test regions was 15% higher than it would have been without the advertising.
Incremental ROAS (iROAS): incremental revenue divided by spend during the test period. This is the most important efficiency metric: it tells you the return on the dollars that were actually causing conversions, not just correlated with them.
Probability of direction: the chance that the true effect is positive, meaning marketing is genuinely adding value. The product shows this as a percentage. For example, "100% chance marketing adds value" means essentially every outcome the model considers plausible is positive.
Credible interval (also shown as HDI, the highest density interval): the range the true value most likely falls within. A 90% credible interval means there is a 90% probability the true value sits inside that range, given the data. A narrower interval that stays positive means a more precise, more trustworthy result.
Synthetic control: a modeled estimate of what would have happened in your test regions without the spend change, built from your control regions. The gap between actual and synthetic control is your marketing contribution.
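
As a rough illustration of how the three headline numbers relate, here is a minimal Python sketch. It assumes lift is computed as marketing contribution divided by the synthetic control baseline, which matches the definitions above but is not confirmed as Triple Whale's exact calculation. The function name and all dollar figures are hypothetical.

```python
def headline_metrics(actual_revenue, synthetic_revenue, spend):
    """Derive the three headline numbers for one test window.

    actual_revenue:    observed revenue in the test regions
    synthetic_revenue: modeled revenue without the advertising (the baseline)
    spend:             ad spend on the tested channel during the window
    """
    if synthetic_revenue <= 0 or spend <= 0:
        raise ValueError("baseline revenue and spend must be positive")
    # Marketing contribution: the gap between actual and synthetic control.
    contribution = actual_revenue - synthetic_revenue
    return {
        "marketing_contribution": contribution,
        # Assumption: lift is expressed relative to the synthetic baseline.
        "revenue_lift": contribution / synthetic_revenue,
        # iROAS: incremental revenue per dollar spent.
        "iroas": contribution / spend,
    }


# Hypothetical figures, for illustration only.
result = headline_metrics(
    actual_revenue=230_000.0,
    synthetic_revenue=200_000.0,
    spend=20_000.0,
)
print(result)
```

With these made-up inputs, the contribution is $30,000 on a $200,000 baseline, which is a 15% lift, and $30,000 on $20,000 of spend is an iROAS of 1.5.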

How it works

A GeoLift test splits comparable regions into two groups. The test group keeps running the advertising you want to measure, and the control group holds it out.

Triple Whale builds a synthetic control, a modeled version of what your test regions would have done without the change, from the control regions. The difference between actual results and the synthetic control is the effect advertising caused. From that difference, the model calculates your marketing contribution, revenue lift, and iROAS, and produces a probability of direction and a credible interval so you know how confident to be. Read those together, never the lift number on its own.
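
To make the probability of direction and the credible interval more concrete, the sketch below computes both from a set of posterior draws of marketing contribution. This is a generic way to summarize a Bayesian estimate, not a description of Triple Whale's internal model. The draws are simulated, and the function names and numbers are hypothetical.

```python
import math
import random


def probability_of_direction(draws):
    """Share of posterior draws in which the effect is positive."""
    if not draws:
        raise ValueError("draws must not be empty")
    return sum(1 for d in draws if d > 0) / len(draws)


def highest_density_interval(draws, mass=0.90):
    """Narrowest interval that contains `mass` of the draws."""
    if not draws or not 0 < mass <= 1:
        raise ValueError("need draws and a mass in (0, 1]")
    ordered = sorted(draws)
    n = len(ordered)
    window = max(1, math.ceil(mass * n))  # number of draws inside the interval
    # Slide a window of that many draws across the sorted values
    # and keep the one with the smallest width.
    start = min(
        range(n - window + 1),
        key=lambda i: ordered[i + window - 1] - ordered[i],
    )
    return ordered[start], ordered[start + window - 1]


# Simulated posterior draws of marketing contribution (illustration only).
rng = random.Random(0)
draws = [rng.gauss(30_000, 12_000) for _ in range(4000)]

pd_value = probability_of_direction(draws)
low, high = highest_density_interval(draws, mass=0.90)
print(f"probability of direction: {pd_value:.1%}")
print(f"90% HDI: {low:,.0f} to {high:,.0f}")
```

If the interval printed here sat entirely above zero and the probability of direction were close to 100%, that would correspond to the trustworthy case described above.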

How to read each metric

Compare against your attribution models

The Results overview tab shows your incremental (GeoLift) result side by side with how traditional attribution models would credit the same channel: First Click, Last Click, Linear All, Linear Paid, and Triple Attribution. This comparison shows how much each model over- or under-credits the channel relative to the true incremental effect. When a platform or attribution model reports a much higher ROAS than your iROAS, the gap is a direct measure of over-attribution, and the incremental number is the more trustworthy one for budget decisions.
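
One way to put a number on that gap is to divide each model's ROAS by the iROAS. The sketch below does this in Python. The ratio is an illustrative convention rather than a metric the product is known to display, and the ROAS values are invented.

```python
def over_attribution(attributed_roas, iroas):
    """Ratio of each model's ROAS to incremental ROAS.

    A ratio above 1.0 means the model over-credits the channel,
    below 1.0 means it under-credits it.
    """
    if iroas <= 0:
        raise ValueError("ratio is only meaningful for a positive iROAS")
    return {model: roas / iroas for model, roas in attributed_roas.items()}


# Hypothetical ROAS per attribution model for the same channel and window.
model_roas = {
    "First Click": 2.4,
    "Last Click": 3.0,
    "Linear All": 1.0,
}
ratios = over_attribution(model_roas, iroas=2.0)

for model, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    label = "over-credits" if ratio > 1 else "under-credits"
    print(f"{model}: {ratio:.2f}x ({label})")
```

In this made-up example, Last Click credits the channel with 1.5 times its incremental return, while Linear All credits it with half.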

Read the breakdowns

Below the headline numbers, results break down so you can see where the lift came from:

Customer type: incremental impact split into New and Returning customers, each with its own iROAS and incremental revenue. This tells you whether the channel is acquiring new customers or mostly driving repeat purchases.
Platform: incremental impact split by platform or channel included in the test.

Use these to decide not just whether to scale, but where the scalable value actually is. A channel that is strongly incremental for new customers calls for a different decision than one that mostly lifts returning buyers.

What a strong result looks like

A strong result has three things at once: a meaningful positive marketing contribution, a high probability of direction (close to 100%), and a credible interval that stays positive across its range. When all three line up, you can act on the iROAS with confidence. A directionally positive result with a lower probability of direction, or a credible interval that nearly touches zero, is still useful. It suggests the channel is driving lift, but you may want a follow-up test to tighten the estimate before making a large budget shift.

What an inconclusive result means, and what to do

A result is inconclusive when the credible interval is wide and crosses zero, or the probability of direction is not high enough to be sure the effect is positive. This does not mean the channel has zero impact. It means there was not enough signal to separate the effect from noise with this test design. When a test comes back inconclusive:

Review the test design. Did anything change during the window that could add noise, such as a promotion, a budget change, or a platform issue?
Check the power the test started with. If the feasibility report showed borderline power, an inconclusive result is not surprising.
Talk with your Triple Whale team about a longer or larger follow-up test, or whether Marketing Mix Modeling (MMM) can answer the question in the interim.
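
The distinctions between a strong, a directional, and an inconclusive result can be sketched as a simple decision rule. The thresholds below (95% and 80% probability of direction, and a lower bound of at least 10% of the contribution to count as clear of zero) are assumptions chosen for illustration. The article does not state numeric cutoffs, so treat them as placeholders to agree on with your team.

```python
def classify_result(contribution, prob_direction, ci_low, ci_high,
                    strong_pd=0.95, minimum_pd=0.80, zero_margin=0.10):
    """Label a finished test using hypothetical, adjustable thresholds."""
    if ci_low > ci_high:
        raise ValueError("ci_low must not exceed ci_high")
    # Interval entirely below zero: investigate before acting.
    if ci_high < 0:
        return "negative"
    # Interval reaches zero, or too little confidence that the effect is positive.
    if ci_low <= 0 or prob_direction < minimum_pd:
        return "inconclusive"
    # From here on the interval is entirely positive.
    clear_of_zero = ci_low >= zero_margin * contribution
    if contribution > 0 and prob_direction >= strong_pd and clear_of_zero:
        return "strong"
    # Positive, but the interval nearly touches zero or confidence is lower.
    return "directional"


# Hypothetical results, for illustration only.
examples = {
    "A": classify_result(30_000, 0.99, 12_000, 48_000),
    "B": classify_result(30_000, 0.97, 500, 60_000),
    "C": classify_result(30_000, 0.85, -10_000, 70_000),
}
print(examples)
```

Here test A is labeled strong, B is directional because its interval nearly touches zero, and C is inconclusive because its interval crosses zero.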

How to read a negative result

Occasionally test regions underperform the synthetic control, which suggests a negative incremental effect. This is rare, and it usually points to a measurement issue rather than truly harmful advertising. Common reasons:

Poor region matching: control regions outperformed for reasons unrelated to advertising, such as a regional event or weather.
Cannibalization: the channel shifts budget from other channels without adding new conversions.
Ad fatigue: the channel is oversaturated in the test markets.

A negative result is worth investigating before you act. Talk with your Triple Whale team before pausing a channel based on a single negative test. For deeper diagnosis, see Troubleshooting incrementality tests.

When to use these results, and their limits

Use iROAS and marketing contribution to compare the true efficiency of channels and to guide budget decisions, especially where platform-reported ROAS looks inflated. Treat a single test as one strong signal, not the final word: the probability of direction and credible interval tell you how far to lean on it. For an always-on view of channel contribution between tests, pair incrementality with MMM. Incrementality is best for causal reads on specific channels over a defined window, and it is not a continuous daily reporting tool.

Reading Incrementality Test Results