{ "cells": [ { "cell_type": "markdown", "id": "fef8da1c", "metadata": {}, "source": [ "(mmm_intro_upper_funnel)=\n", "# Introduction: Measuring Upper-Funnel Impact with PyMC-Marketing\n", "\n", "In this notebook, we walk through a simple example of how to use PyMC-Marketing to measure the impact of upper-funnel channels on a downstream business outcome. A more detailed and advanced version of this notebook is available in {ref}`mmm_upper_funnel_causal_approach`.\n" ] }, { "cell_type": "markdown", "id": "2cdeccb8", "metadata": {}, "source": [ "## Business Challenge\n", "\n", "Upper-funnel marketing investments (awareness campaigns, video ads, influencer partnerships) rarely drive immediate conversions. Leadership often asks: *\"Are our upper-funnel dollars doing anything?\"* Standard dashboards show weak direct correlations with sales, making these channels appear ineffective.\n", "\n", "The challenge is that **upper-funnel channels work indirectly**. They build awareness that flows through mid-funnel engagement and eventually reaches high-intent channels (like brand search) that directly drive outcomes. 
Ignoring this mediation structure either under-credits upper-funnel impact or incorrectly attributes all lift to the final touchpoint.\n", "\n", "**This notebook shows how to:**\n", "- Respect the causal structure of the marketing funnel\n", "- Measure indirect effects through mediation modeling\n", "- Generate counterfactual predictions to quantify upper-funnel impact\n", "\n", "For a detailed treatment including data generation and theoretical foundations, see {ref}`mmm_upper_funnel_causal_approach`.\n", "\n", "```{tip}\n", "This approach is very similar to the one presented in the blog post [Bayesian Media Mix Modeling using PyMC3, for Fun and Profit](https://engineering.hellofresh.com/bayesian-media-mix-modeling-using-pymc3-for-fun-and-profit-2bd4667504e6) by [HelloFresh](https://engineering.hellofresh.com/).\n", "```\n", "```{note}\n", "Funnel effects in marketing mix modeling are well known, but how to tackle them with causal inference is rarely explained clearly. Here, we want to bridge this gap.\n", "\n", "For instance, the well-known paper [Challenges and Opportunities in Media Mix Modeling](https://research.google/pubs/challenges-and-opportunities-in-media-mix-modeling/) by D. Chan et al. (Google) already mentions:\n", "\n", "*Selection bias arises when there are funnel effects in the media and the model is mis-specified. When\n", "an ad channel also impacts the level of another ad channel\n", "which simultaneously estimate the impact of all ad channels in one equation, will lead to biased\n", "estimates. An example is a TV campaign driving more related queries, which in turn increase\n", "volume of paid search ads. For a derivation of the size of the bias, see Angrist and Krueger (1999).\n", "The underlying reason for the bias in the mis-specified model is that the downstream ads were\n", "affected by the TV ads. 
In assessing the ROAS of TV, the linear regression model does not account\n", "for the changes in paid search ads caused by TV.\n", "Downstream ads should not be included with exogenously-determined ads in a single regression\n", "equation. Alternatives include graphical models (Pearl (2009)) and structural equation models.\n", "However, the problem isn’t only a matter of model form. Both these alternatives require estimating\n", "the causal effect of the upstream ads on the downstream ads, which has just as stringent data\n", "requirements as estimating the effect of an ad channel on sales.*\n", "```" ] }, { "cell_type": "markdown", "id": "e84a0ca0", "metadata": {}, "source": [ "## Prepare Notebook" ] }, { "cell_type": "code", "execution_count": 1, "id": "5781ca23", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/juan.orduz/.local/share/mamba/envs/pymc-marketing-env/lib/python3.13/site-packages/pymc_extras/model/marginal/graph_analysis.py:10: FutureWarning: `pytensor.graph.basic.io_toposort` was moved to `pytensor.graph.traversal.io_toposort`. Calling it from the old location will fail in a future release.\n", " from pytensor.graph.basic import io_toposort\n", "/Users/juan.orduz/Documents/pymc-marketing/pymc_marketing/pytensor_utils.py:34: FutureWarning: `pytensor.graph.basic.ancestors` was moved to `pytensor.graph.traversal.ancestors`. Calling it from the old location will fail in a future release.\n", " from pytensor.graph.basic import ancestors\n", "/Users/juan.orduz/Documents/pymc-marketing/pymc_marketing/mmm/multidimensional.py:216: FutureWarning: This functionality is experimental and subject to change. 
If you encounter any issues or have suggestions, please raise them at: https://github.com/pymc-labs/pymc-marketing/issues/new\n", " warnings.warn(warning_msg, FutureWarning, stacklevel=1)\n" ] } ], "source": [ "import arviz as az\n", "import graphviz\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pandas as pd\n", "import seaborn as sns\n", "from pymc_extras.prior import Censored, Prior\n", "\n", "from pymc_marketing.mmm import (\n", "    GeometricAdstock,\n", "    MichaelisMentenSaturation,\n", "    NoAdstock,\n", "    NoSaturation,\n", ")\n", "from pymc_marketing.mmm.multidimensional import MMM\n", "from pymc_marketing.paths import data_dir\n", "\n", "az.style.use(\"arviz-darkgrid\")\n", "plt.rcParams[\"figure.figsize\"] = [12, 7]\n", "plt.rcParams[\"figure.dpi\"] = 100\n", "plt.rcParams[\"figure.facecolor\"] = \"white\"\n", "\n", "SEED = 142\n", "rng = np.random.default_rng(SEED)\n", "\n", "%load_ext autoreload\n", "%autoreload 2\n", "%config InlineBackend.figure_format = \"retina\"" ] }, { "cell_type": "markdown", "id": "2f428b90", "metadata": {}, "source": [ "## Read Data\n", "\n", "Let's consider a simple example with a downstream business outcome (new users) and a set of upper-funnel channels (`impressions_x1`, `impressions_x2`, `impressions_x3`, `impressions_x4`). This synthetic data is generated in the advanced version of this notebook, {ref}`mmm_upper_funnel_causal_approach`." ] }, { "cell_type": "code", "execution_count": 2, "id": "cd442407", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|   | date | target_var | impressions_x1 | impressions_x2 | impressions_x3 | impressions_x4 | event_2020_09 | event_2020_12 | event_2021_09 | event_2021_12 | event_2022_09 | trend |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2022-01-01 | 0.4966 | 0.1577 | 0.1194 | 0.2208 | 0.1134 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 |
| 1 | 2022-01-02 | 0.5341 | 0.1394 | 0.1169 | 0.2263 | 0.1157 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1 |
| 2 | 2022-01-03 | 0.5659 | 0.1712 | 0.1177 | 0.2268 | 0.1240 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2 |
| 3 | 2022-01-04 | 0.5761 | 0.1175 | 0.1163 | 0.2247 | 0.1221 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3 |
| 4 | 2022-01-05 | 0.5679 | 0.0927 | 0.1177 | 0.2209 | 0.1156 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4 |