Causal Discovery

Lifesight’s approach to Causal Graph–powered discovery

At Lifesight, the structure of the Causal DAG does not start with an algorithm — it starts with deep domain understanding. This principle is also central to the Pearlian framework: causal structure should be guided by real-world knowledge, not just statistical patterns.

Once this initial structure is defined, Lifesight applies a two-step process:

Step 1: Causal Discovery
Step 2: Causal Effect Estimation

[Note: ML-based inference, the second pillar of the MMM framework, runs in parallel with Step 1. The direct effects it estimates are then back-propagated using the causal adjustment multiplier from Step 2.]


Statistical Paradoxes and the Intuition behind the need for Causal Graph

Before diving into the steps of Causal Discovery & Estimation, it is essential to build the right intuition about why causal structure matters in the first place. The fastest way to build this intuition is by understanding two classic causal errors that regularly mislead marketers, analysts, and even advanced models:

  • Simpson’s Paradox
  • Berkson’s Paradox

These paradoxes demonstrate how incorrect conditioning or aggregation can completely reverse or fabricate relationships between variables.

Causal Intuition: Simpson’s Paradox vs Berkson’s Paradox

These two paradoxes explain why naïve regression or correlation analysis fails, and why causal graphs are necessary.

Simpson’s Paradox – The Hidden Confounder Problem

Reference 1 - https://brilliant.org/wiki/simpsons-paradox/
Reference 2 - https://en.wikipedia.org/wiki/Simpson%27s_paradox

Simpson's paradox is a phenomenon in probability and statistics in which a trend appears in several groups of data but disappears or reverses when the groups are combined.

Most errors in marketing measurement are not due to weak algorithms, but due to incorrect causal assumptions.

Structure: A hidden variable (Z) influences both X and Y
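This reversal is easy to reproduce. The sketch below (illustrative only, not Lifesight code) simulates a hidden confounder Z that drives both spend X and revenue Y: the pooled slope of Y on X comes out with the opposite sign of the true within-group effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
z = rng.integers(0, 2, n)                   # hidden confounder, e.g. seasonality
x = 5 * z + rng.normal(0, 1, n)             # spend: pushed up by Z
y = -1 * x + 10 * z + rng.normal(0, 1, n)   # true direct effect of X on Y is -1

def slope(u, v):
    """OLS slope of v regressed on u."""
    return np.polyfit(u, v, 1)[0]

pooled = slope(x, y)                                    # positive: wrong sign
within = [slope(x[z == g], y[z == g]) for g in (0, 1)]  # close to -1 per group
```

Pooling over Z flips the sign because Z lifts both X and Y; conditioning on the confounder recovers the true negative effect.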

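Berkson's paradox, the second error listed above, is the mirror image: conditioning on a common effect (a collider) fabricates a correlation between two genuinely independent variables. A minimal sketch with hypothetical variable names (not Lifesight code):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
x = rng.normal(size=n)        # e.g. creative quality, independent of y
y = rng.normal(size=n)        # e.g. offer attractiveness
selected = x + y > 1.5        # collider: only "winning" campaigns get analysed

corr_all = np.corrcoef(x, y)[0, 1]                      # near zero
corr_sel = np.corrcoef(x[selected], y[selected])[0, 1]  # clearly negative
```

Among the selected campaigns, high creative quality predicts a weaker offer and vice versa, even though the two are unrelated in the full population.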

Fast Causal Inference (FCI) Algorithm

Below is a broad skeleton of the Causal Graph (in the actual causal graph, each of these nodes is expanded into its specific tactics and channels).

Example: Brand Tactics will be expanded to TV, OOH, and Prospecting Tactics across Meta, Google Display, YouTube, TikTok, etc. Performance Tactics will be expanded to Search, Retargeting, etc.

We then apply the approach known as Fast Causal Inference (FCI), a popular variant of the PC algorithm. Unlike PC, FCI allows for unobserved (latent) variables in the DAG.

FCI = PC + Real-World Awareness

| Phase | PC algorithm | FCI algorithm |
| --- | --- | --- |
| Skeleton | Uses conditional independence tests to remove edges | Same as PC, but more conservative |
| Separating sets | Stores sepset(A, B) | Same, but used more aggressively later |
| Collider detection | Identifies A → B ← C | Same, but with extra latent checks |
| Orientation | Uses Meek's orientation rules | Uses extended orientation rules for latent confounders |
| Extra search | None | Performs additional conditional tests on subsets (Possible-D-SEP) |
| Output | CPDAG | PAG |

FCI = PC + extra conditional tests + conservative orientation
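Those extra conditional tests, like the skeleton-phase tests in PC, are commonly implemented for Gaussian data as a Fisher-z test on the partial correlation. A minimal sketch of such a test (an illustration, not Lifesight's implementation):

```python
import numpy as np
from scipy import stats

def fisher_z_pvalue(data, i, j, cond=()):
    """p-value for H0: X_i is independent of X_j given X_cond.

    Computes the partial correlation from the inverse correlation matrix,
    then applies the Fisher z-transform to get an approximate normal statistic.
    """
    idx = [i, j, *cond]
    prec = np.linalg.inv(np.corrcoef(data[:, idx], rowvar=False))
    r = -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])   # partial correlation
    z = np.arctanh(r) * np.sqrt(data.shape[0] - len(cond) - 3)
    return 2 * (1 - stats.norm.cdf(abs(z)))              # large p => drop edge i-j
```

In the skeleton phase, the edge i–j is removed as soon as some conditioning set makes this p-value exceed the chosen significance level.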


FCI Methodology


| Phase | Step | What happens (Methodology) | Key Output | What this gives you in MMM / Lifesight |
| --- | --- | --- | --- | --- |
| 1. Problem setup | 1.1 Define variables | Select observed variables (media, revenue, price, seasonality, events, etc.) | Variable set V | Ensures all measurable drivers are accounted for |
| | 1.2 Draw prior DAG | Create expert-driven initial DAG (allow partially directed edges) | Prior DAG | Encodes business knowledge & causal beliefs |
| | 1.3 Define constraints | Lock impossible edges (e.g., Revenue → Spend) | Constraint matrix | Prevents nonsensical directions |
| 2. Data prep | 2.1 Time alignment | Align all variables to the same granularity & lags | Clean dataset | Prevents spurious causal detection |
| | 2.2 Normalization | Scale / transform variables (log, z-score, adstock) | Model-ready data | Improves CI test reliability |
| | 2.3 Stationarity checks | Test for drift / non-stationarity (ADF, KPSS) | Stationary series | Required for accurate inference |
| 3. Skeleton discovery | 3.1 Fully connected graph | Start with a complete undirected graph | G₀ | No assumptions removed yet |
| | 3.2 Cond. independence tests | Remove edge A–B if A ⫫ B \| S | Skeleton graph + Sepsets | |
| 4. Possible-D-SEP expansion | 4.1 Compute Possible-D-SEP | Find nodes that could d-separate A and B under hidden confounders | Expanded conditioning sets | Simulates invisible variables |
| | 4.2 Extra CI tests | Retest adjacencies with PDS | Revised skeleton | Removes false edges missed by PC |
| 5. Orientation (colliders) | 5.1 Find v-structures | If B ∉ Sepset(A, C), mark A → B ← C | Initial directions | Detects colliders |
| | 5.2 Check against PDS | Revalidate with latent-aware logic | Refined v-structures | Avoids false collider claims |
| 6. Orientation (rules) | 6.1 Apply extended Meek rules | Orient where logically forced | Directed edges | Prevents cycles |
| | 6.2 Partial orientation | Use ambiguous marks (o→, o-o) | PAG graph | Encodes uncertainty explicitly |
| 7. Selection bias handling | 7.1 Detect S-bias | Identify possible selection variables | S-flagged edges | Explains biased correlations |
| 8. Causal strength estimation | 8.1 Identify identifiable edges | Keep only directed / semi-directed paths | Estimable subgraph | Safe-to-quantify relationships |
| | 8.2 Choose estimator | Regression / SEM / SCM / DoWhy / PyMC | Estimation engine | Translates structure → effect |
| | 8.3 Estimate total effect | Compute ATE / direct / indirect effects | Effect size | "Spend → Revenue" impact |
| | 8.4 Estimate uncertainty | Bootstrap / MCMC / Bayesian HDI | Confidence / credible intervals | Model reliability |
| 9. Strength scoring | 9.1 Edge stability | Run FCI across subsamples | Stability % | Robustness metric |
| | 9.2 Orientation confidence | % of runs in which the same arrow appears | Direction probability | Trust in direction |
| | 9.3 CI test strength | p-value or partial correlation | Dependency strength | Relationship reliability |
| | 9.4 Composite causal score | Weighted score: (stability + direction + p) | 0–1 causal strength index | Rank channels by causal validity |
| 10. Validation & interpretation | 10.1 Compare to experiments | Check alignment with lift tests / geo tests | Coherence score | Ground-truth alignment |
| | 10.2 Causal vs predictive gap | Compare vs MMM regression | Divergence measure | Identify overfitting or bias |
| | 10.3 Final decision | Flag strong vs weak assumptions | Causal readiness index | Deployment readiness |
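The edge-stability metric in step 9.1 can be approximated with a simple subsampling loop. In the sketch below, `edge_fn` is a hypothetical stand-in for a full FCI rerun plus a check of whether the edge appears in the resulting PAG (an assumption for illustration, not Lifesight's pipeline):

```python
import numpy as np

def edge_stability(data, edge_fn, n_runs=50, frac=0.8, seed=0):
    """Fraction of subsample runs in which edge_fn reports the edge as present.

    edge_fn(data) -> bool stands in for: rerun FCI on the subsample and
    test whether the edge of interest survives in the output PAG.
    """
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    hits = 0
    for _ in range(n_runs):
        idx = rng.choice(n, int(frac * n), replace=False)
        hits += bool(edge_fn(data[idx]))
    return hits / n_runs
```

A genuinely dependent pair should survive nearly every subsample, while a spurious edge should appear only sporadically.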


Example of the Output

| Relationship | PAG edge | Stability | Direction confidence | Avg p-value | Causal Strength Score |
| --- | --- | --- | --- | --- | --- |
| Meta → Revenue | → | 0.91 | 0.89 | 0.002 | 0.90 (Strong) |
| TV ↔ Revenue | ↔ | 0.63 | 0.10 | 0.04 | 0.41 (Weak) |
| Search → Revenue | → | 0.95 | 0.93 | 0.001 | 0.94 (Very strong) |
| Influencer o→ Search | o→ | 0.55 | 0.45 | 0.07 | 0.50 (Weak) |
| Email → Revenue | → | 0.88 | 0.90 | 0.01 | 0.89 (Strong) |
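The composite score in step 9.4 can be any monotone combination of the three signals. The weights and p-value mapping below are illustrative assumptions, not Lifesight's published formula:

```python
def causal_strength(stability, direction_conf, p_value,
                    weights=(0.4, 0.4, 0.2), p_cutoff=0.05):
    """Illustrative 0-1 composite causal score (assumed weights)."""
    w_s, w_d, w_p = weights
    p_term = 1.0 - min(p_value / p_cutoff, 1.0)   # map p-value into [0, 1]
    return w_s * stability + w_d * direction_conf + w_p * p_term
```

With these weights, a row like Search → Revenue (high stability, high direction confidence, tiny p-value) scores near 1, while TV ↔ Revenue scores low.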

Relationships with strong causal strength scores are then used in the Contribution Back-Propagation approach mentioned above.