Round 3 · Critic Approved ✓

PrismMMM

Autonomous Marketing Mix Modelling — three independent models,
five specialist agents, zero manual tuning.

13.1%
Best Test MAPE
2.9×
Meta Instagram ROI
3
Rounds Run
44%
MAPE Reduction
PrismMMM · Architecture

Architecture

Five specialist agents coordinate autonomously — no human involvement between rounds. Each agent has a single role so no agent can both produce and approve its own output.

The 5-Agent Loop
🔍
Data Explorer · Round 1 only
EDA on raw data — collinearity, anomalies, VIF, readiness score (1–5)
→ EXPLORATION_DONE
⚙️
Tuner · Round 2+
Reads prior fit metrics, proposes one config change per round (adstock, Hill slope, samples)
→ CONFIG_UPDATED or NO_CHANGE
🧮
Models · every round
Ridge + bootstrap · PyMC Bayesian · LightweightMMM/NNLS — run in sequence, results saved to JSON
→ results/latest.json
📊
Analyst
ROI rankings, model agreement, contribution plausibility, business narrative — under 400 words
→ ANALYSIS_DONE
🔎
Critic
6-point quality gate: overfitting, sign correctness, plausibility, consensus honesty, collinearity, sample size
→ APPROVED or REVISE
📝
Reporter · after APPROVED
Plain English for CMO audience — no jargon, no model names in headline. Generates report.md + report.pptx
→ REPORT_DONE
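Stubbed as plain Python, the round loop above might look like this. The function and state keys are hypothetical stand-ins (the real agents are autonomous LLM calls); only the signal strings and the results path come from the cards above:

```python
def run_round(round_num, state):
    """One pass through the pipeline: Explorer (round 1) or Tuner (round 2+),
    then Models, Analyst, and the Critic's verdict."""
    if round_num == 1:
        state["explorer"] = "EXPLORATION_DONE"
    else:
        state["tuner"] = "CONFIG_UPDATED"        # one config change per round
    state["results"] = "results/latest.json"     # all three models write shared JSON
    state["analyst"] = "ANALYSIS_DONE"
    return "APPROVED"                            # Critic's 6-point gate, stubbed here

state = {}
verdicts = [run_round(r, state) for r in range(1, 4)]
if verdicts[-1] == "APPROVED":
    state["reporter"] = "REPORT_DONE"            # Reporter runs only after APPROVED
```

The key design property survives even in the stub: the Critic's verdict gates the Reporter, so no agent approves its own output.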
Why Three Models?
Ridge
Regularised regression + 200-sample bootstrap. Runs in seconds. Always runs — no extra dependencies.
🎲
PyMC (Bayesian)
Full posterior distributions. Handles small samples via priors. Gold standard for production budget decisions.
LightweightMMM / NNLS
Positive-constrained estimates — no negative ROI. JAX-accelerated when available, NNLS fallback otherwise.
🎯
Agreement = confidence. Disagreement = diagnostic.
Where all three agree on a channel, act with confidence. Where they disagree — that's the models telling you the data is thin, channels are collinear, or an assumption is wrong. No single model can tell you this.
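As a sketch, the agreement check can be as simple as measuring per-channel ROI spread across models. `agreement` is a hypothetical helper and the ROI numbers are illustrative, not from this run:

```python
def agreement(rois_by_model, rel_tol=0.5):
    """Split channels into agreed / disputed: agreed means every model's ROI
    sits within rel_tol of the cross-model mean (and the mean is positive)."""
    channels = rois_by_model[next(iter(rois_by_model))].keys()
    agreed, disputed = [], []
    for ch in channels:
        vals = [m[ch] for m in rois_by_model.values()]
        mean = sum(vals) / len(vals)
        if mean > 0 and all(abs(v - mean) <= rel_tol * mean for v in vals):
            agreed.append(ch)
        else:
            disputed.append(ch)
    return agreed, disputed

rois = {
    "ridge": {"meta_fb": 1.6, "google_display": -0.5},
    "nnls":  {"meta_fb": 1.8, "google_display": 0.9},
}
agreed, disputed = agreement(rois)  # meta_fb agreed; google_display disputed
```

Disputed channels are the diagnostic: a sign flip like google_display's usually points at collinear spend, not a real negative effect.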
To run
Read program.md and run the loop.
Notion Knowledge Layer
Notion · 3 Databases: Field Definitions · Business Context · Known Issues
→ discover.py · Auto-profiling: Detects columns · Pulls Notion context · Generates config
→ metadata.json · Shared Context: Expected ROI ranges · Seasonality notes · Data quality flags
→ All 5 Agents · Live Knowledge: No code changes; business teams update Notion directly
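A discover.py-style profiling step could be sketched like this. The heuristics, field names, and metadata shape are assumptions for illustration, not the actual script:

```python
import json
import re

def profile_columns(columns):
    """Guess media-spend columns and the KPI column from names; purely heuristic."""
    channels = [c for c in columns if re.match(r"(google|meta)_", c)]
    kpi = next((c for c in columns if "revenue" in c), None)
    return {"kpi": kpi, "channels": channels}

def build_metadata(profile, notion_context):
    """Merge the column profile with Notion-sourced business context."""
    return {**profile, **notion_context}

cols = ["week", "revenue_usd", "google_search", "google_shopping", "meta_facebook"]
meta = build_metadata(
    profile_columns(cols),
    {"expected_roi_range": [0.5, 5.0], "seasonality": "May + Dec peaks"},
)
metadata_json = json.dumps(meta)  # would be persisted as metadata.json
```

Because the agents read the merged context rather than hard-coded config, business teams editing Notion change model behaviour without touching code.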
Data Explorer · Round 1

EDA before any model training

✅ Readiness Score: 4 / 5 — 132 weekly periods · 0 missing values · no severe collinearity
Revenue over time (weekly, USD)

Mean $30.3M · Max $114M (holiday spikes) · CV 67% — right-skewed, seasonal pattern

Total spend by channel (2021–2024)
Anomalies detected
2022-05-30
Revenue $114M + google_shopping spend spike — synchronised outlier, likely a promo event
z = 4.1σ
2023-05-29
Revenue $113M — possible annual promotional period
z = 4.1σ
2023-12-11
Revenue $104M — holiday season spike
z = 3.6σ
2023-05-22
Revenue $95M — keep with flag
z = 3.2σ
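Behind each flag is a plain z-score: deviation from the mean revenue in standard-deviation units. A minimal sketch on toy numbers, not the real revenue series:

```python
def flag_anomalies(dates, revenue, z_thresh=3.0):
    """Return (date, z) pairs for weeks more than z_thresh sigmas above the mean."""
    n = len(revenue)
    mean = sum(revenue) / n
    std = (sum((v - mean) ** 2 for v in revenue) / n) ** 0.5
    return [(d, round((v - mean) / std, 1)) for d, v in zip(dates, revenue)
            if (v - mean) / std > z_thresh]

weeks = [f"w{i}" for i in range(21)]
rev = [10.0] * 20 + [200.0]          # one promo-style spike at the end
flags = flag_anomalies(weeks, rev)   # only the spike week is flagged
```

The Explorer keeps flagged weeks rather than dropping them, which is why the later risk card asks for an event indicator instead of deletion.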
Channel correlation with revenue (Pearson r)

google_display (r=0.02) and google_video (r=−0.05) show near-zero KPI correlation — low identifiability.

Flags before modelling
⚠️ google_shopping — 76.5% zero-spend weeks · sparse · may not be identifiable
⚠️ 2022-05-30 — add binary event indicator before attribution
✅ No severe collinearity · max VIF 6.57 · no channel pair |r| > 0.7
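For intuition on the VIF threshold: VIF is 1/(1−R²) from regressing one predictor on the others, and in the two-predictor case R² is simply the squared Pearson correlation. A toy sketch, not the actual channel data:

```python
def pearson_r(x, y):
    """Pearson correlation between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def vif_two_predictors(x, y):
    """VIF = 1 / (1 - R^2); values above ~10 are usually called severe."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r * r)

vif = vif_two_predictors([1, 2, 3, 4], [2, 1, 4, 3])  # r = 0.6 -> VIF = 1.5625
```

On that scale the run's max VIF of 6.57 corresponds to moderate but tolerable collinearity, which is why the Explorer passed the data.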
01 — The Dataset

Conjura
eCommerce MMM

Weekly spend and revenue across 8 Google + Meta channels for an Apparel brand. KPI: first-purchase revenue in USD.

Source: Figshare · Multi-Region MMM Dataset · 93 brands

132
Weekly observations
128 train · 4 holdout
8
Media channels
Google Search, Shopping, PMax, Display, Video · Meta FB, IG, Other
$30M
Average weekly revenue
USD — peak $114M (holiday)
5
Revenue anomalies flagged
Seasonal spikes · z > 3σ · May & Dec peaks
02 — Three Models

Each model has a different blind spot

Agreement is signal. Disagreement is a diagnostic. Running all three is the only way to know which result to trust.

Ridge

frequentist · L2 regularisation
Strengths
Fast, runs in seconds
Best out-of-sample MAPE this run
Limitations
Alpha=109.85 zeroed all channel coefficients
Baseline-only model — no channel signal
13.1%
Test MAPE · R²=0.46

NNLS

non-negative least squares · fallback
Strengths
Only model producing channel-level ROI
Enforces positive attribution
Limitations
No uncertainty bounds
Sensitive to outlier spend weeks
13.1%
Test MAPE · R²=0.46

PyMC

bayesian · fallback BLM
Strengths
Principled uncertainty quantification
HalfNormal priors constrain positivity
Limitations
Full MMM timed out — no C compiler
Fallback BLM gives zero channel ROI
13.1%
Test MAPE · R²=0.46
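The positivity constraint NNLS enforces can be illustrated with a tiny coordinate-descent solver. This is a stand-in for scipy.optimize.nnls on toy data, not the pipeline's implementation:

```python
def nnls_cd(X, y, iters=200):
    """Non-negative least squares via projected coordinate descent (illustrative)."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        for j in range(p):
            # Residual with feature j's own contribution removed
            resid = [yi - sum(X[i][k] * beta[k] for k in range(p) if k != j)
                     for i, yi in enumerate(y)]
            num = sum(X[i][j] * resid[i] for i in range(len(y)))
            den = sum(X[i][j] ** 2 for i in range(len(y)))
            beta[j] = max(0.0, num / den) if den else 0.0  # clamp at zero
    return beta

# A channel whose best unconstrained coefficient would be negative gets clamped to 0
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [2.0, -1.0, 2.0]
beta = nnls_cd(X, y)
```

The clamp is exactly why NNLS can never report a negative ROI: implausible negatives show up as zeroed channels instead.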
03 — Autonomous Tuning

Three rounds, one change per round

The Tuner reads fit metrics and picks one parameter to change; the models re-run, and the Critic approves or requests revision before the next round starts.

R1
23.2%
Baseline
lag=2 · slope=1.5
APPROVED
R2
20.4%
adstock lag
2 → 1 ▼12%
APPROVED
R3
13.1%
hill_ec
0.5 → 0.3 ▼36%
APPROVED ✓
Best MAPE 23.2% → 20.4% → 13.1% ✓ — 44% total reduction · 3 rounds
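The two knobs the Tuner turned are standard MMM transforms: geometric adstock (carryover across a lag window) and Hill saturation (diminishing returns, with half-effect at the ec point). A minimal sketch using the Round 3 values:

```python
def geometric_adstock(spend, decay=0.5, lag=1):
    """Each week inherits geometrically decayed spend from up to `lag` prior weeks."""
    return [sum(spend[t - k] * decay ** k for k in range(lag + 1) if t - k >= 0)
            for t in range(len(spend))]

def hill(x, ec=0.3, slope=1.5):
    """Hill saturation on normalised spend x: returns 0.5 exactly at x == ec."""
    return x ** slope / (x ** slope + ec ** slope)

carry = geometric_adstock([1.0, 0.0, 0.0], decay=0.5, lag=1)  # [1.0, 0.5, 0.0]
half = hill(0.3, ec=0.3, slope=1.5)                            # 0.5 by construction
```

Shortening the lag (Round 2) cuts how long a spend pulse lingers; lowering ec (Round 3) makes channels saturate earlier, i.e. more of the observed spend range sits on the flat part of the curve.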
04 — Channel ROI

Meta dominates.
Two models agree.

First cross-model signal: Ridge and NNLS both rank Meta Facebook #1 and Meta Instagram #2. ROI shown is NNLS; Ridge values are similar.

2.9×
Meta Instagram ROI · NNLS · Round 3
Meta confirmed
by two
models.
Meta Facebook 37–42% · Meta Instagram 10–20% · Ridge + NNLS agree · Round 3
1.7×
Meta Facebook ROI (cross-model mean)
47–62%
Meta revenue share across models
2.9×
Meta Instagram ROI · best channel
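ROI here is attributed revenue divided by spend. The attributed-revenue figures below are back-calculated from the reported multiples and the spend levels cited in the recommendations ($226M and $624M), purely for illustration:

```python
def channel_roi(attributed_revenue_m, spend_m):
    """ROI multiple per channel: attributed revenue / spend, both in $M."""
    return {ch: round(attributed_revenue_m[ch] / spend_m[ch], 1) for ch in spend_m}

roi = channel_roi(
    {"meta_instagram": 655.4, "meta_facebook": 1060.8},  # implied by 2.9x and 1.7x
    {"meta_instagram": 226.0, "meta_facebook": 624.0},
)
```

This is also why Instagram can be the more efficient channel despite Facebook's larger contribution share: the denominator matters as much as the attribution.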
05 — Contribution Breakdown

How each model attributes GMV

NNLS Contribution (Round 3)
67% MEDIA SHARE
Meta Facebook — 42.5%
Meta Instagram — 20.1%
Google Shopping — 4.5%
Baseline / Organic — 33%

First cross-model signal

Ridge and NNLS now both attribute positive revenue to Meta Facebook and Meta Instagram — the first agreement across two models. Ridge adds Google Search (19.5%) which NNLS misses. PyMC fallback still returns zero channel attribution. Google Display (−12.8%) and Google Video (−2.6%) show negative Ridge contributions — a sign-confounding artefact from correlated spend, not a real negative effect.

Ridge Contribution
NNLS Contribution
06 — Risks & Caveats

01
⚠️
Outlier Spend Week
2022-05-30: google_shopping spend spiked z=10.2σ and revenue spiked z=4.1σ in the same week. NNLS attributes the full revenue uplift to google_shopping, inflating its ROI to 22.3×. This is almost certainly a data artefact, not a real response curve. Add an event indicator before trusting that estimate.
Needs event flag
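The event-indicator fix is one binary column per flagged week, so the promo uplift is absorbed by the indicator instead of the media coefficients. A sketch with hypothetical row dicts:

```python
def add_event_flag(rows, event_dates):
    """Append a 0/1 is_event column; flagged weeks get 1, all others 0."""
    return [dict(r, is_event=int(r["week"] in event_dates)) for r in rows]

flagged = add_event_flag(
    [{"week": "2022-05-30", "revenue": 114.0},
     {"week": "2022-06-06", "revenue": 30.0}],
    {"2022-05-30"},
)
```

With the indicator in the design matrix, the model can explain the $114M week without bending google_shopping's response curve to fit it.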
02
🧮
Only One Model Has Signal
Ridge (alpha=109.85) and PyMC fallback both zero all channel coefficients. Only NNLS produces channel attribution — making cross-model agreement structurally impossible. Install gcc for PyMC's full MMM and fix Ridge's regularisation before trusting any ROI ranking.
3 models needed
03
📉
4-Period Holdout
Test MAPE is calculated on just 4 weeks. A single outlier week (holiday season, promo) can swing MAPE by 5–10 percentage points. Expand holdout to 8–13 weeks (one full quarter) for reliable out-of-sample evaluation.
Expand holdout
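Why 4 weeks is fragile: a single outlier week dominates a short-window MAPE. A toy comparison with made-up revenue figures:

```python
def mape(actual, pred):
    """Mean absolute percentage error over the holdout weeks."""
    return 100.0 * sum(abs(a - p) / a for a, p in zip(actual, pred)) / len(actual)

# Same single miss (a $114M spike predicted at $60M), two holdout lengths
short = mape([30.0, 30.0, 30.0, 114.0], [30.0, 30.0, 30.0, 60.0])   # ~11.8%
long_ = mape([30.0] * 12 + [114.0], [30.0] * 12 + [60.0])           # ~3.6%
```

One miss swings the 4-week score by roughly 8 percentage points relative to the 13-week score, which is the instability the risk card is pointing at.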
07 — Recommendations

Three actions, in priority order

01
Meta Instagram and Facebook are the only confirmed channels
Ridge and NNLS both rank Meta Facebook #1 and Meta Instagram #2 — the first cross-model agreement in this project. Meta Instagram delivers more per dollar (NNLS: 2.9×, Ridge: 1.5×) despite lower absolute spend ($226M vs $624M). This is directionally actionable within the Meta portfolio — consider shifting budget toward Instagram at the margin.
02
Do not act on Google channel ROIs yet
Google Display shows −$55 Ridge ROI, Google Video −$4.4 — sign confounding from correlated spend patterns. Google Search shows 30.7× Ridge ROI which is implausibly high (meta_other shows 291× for the same reason: tiny spend amplifies any positive coefficient). These signals need PyMC's priors to stabilise before any reallocation.
03
Install gcc to unlock full Bayesian MMM
Run brew install gcc — this enables PyTensor C compilation, cutting PyMC sampling from 58 minutes to ~5 minutes. With full PyMC running, cross-model consensus on Meta channels would be confirmed with posterior credible intervals — enough to justify a budget reallocation decision at the marketing director level.
Bottom line: First real finding — Meta Facebook and Instagram confirmed by two independent models. Meta Instagram is more efficient per dollar. Google channel ROIs are noisy and need PyMC to stabilise. Install gcc and run Round 4 for board-level confidence.

Dataset: Multi-Region MMM Dataset for eCommerce Brands, Figshare, CC BY 4.0. Results are illustrative and not from a real brand.