Machine Learning Can Predict, But Can It Decide?

Introduction: The Philosopher's Question

Imagine you're a physician in the 18th century, observing that patients who receive a particular treatment tend to recover. You notice this pattern again and again: treatment and recovery seem to go hand in hand. Your mind begins to form an expectation: when I see treatment, I expect to see recovery.

This is precisely what the Scottish philosopher David Hume observed about human understanding. In his groundbreaking work, A Treatise of Human Nature, Hume argued that our belief in causation is fundamentally built on three pillars:

  1. Constant Conjunction: We observe events occurring together repeatedly
  2. Temporal Priority: The cause must precede the effect
  3. The Feeling of Expectation: When we see the cause, we psychologically expect the effect to follow

But here's the revolutionary insight that shook philosophy to its core: We can never directly observe causation itself. We only observe patterns, associations, correlations. The "feeling of expectation" that makes us believe in causation is, in Hume's view, a psychological habit, not a logical certainty.

Fast forward to the 21st century, and this philosophical puzzle has become a practical crisis in medicine, public policy, and data science. We have more data than ever before, but distinguishing true causes from mere correlations remains one of the most challenging problems in science.

The Biomedical Dilemma

Consider a scenario that plays out in hospitals and research labs every day:

A new drug is developed to treat a disease.

Early observations show that patients who receive the drug have better recovery rates than those who don't.

The correlation is strong; the statistical significance is clear. The drug appears to work.

But here's the catch:

What if patients with a certain genetic predisposition are both more likely to receive the drug (perhaps because their doctors recognise they're good candidates) and more likely to recover (because of their genetics)?

In this case, the drug might not be causing recovery at all—it might just be correlated with recovery through a hidden third factor: genetics.

Let's bring this abstract problem to life with a concrete example. We'll simulate a clinical scenario where:

  • Treatment (T): A new therapeutic drug
  • Recovery (R): Patient recovery outcome  
  • Genetic Predisposition (G): An underlying genetic factor

The critical question we need to answer:

Does the drug actually cause recovery, or is the association we observe merely a mirage created by genetic predisposition?

In reality, we would never know the true data-generating process. But for demonstration purposes, we'll simulate data where we know the truth: the drug does have a real causal effect, but it's partially masked by the confounding influence of genetics. This mirrors real-world situations where multiple factors interact in complex ways, and the challenge is to untangle these relationships to discover what truly causes what.
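To make this concrete, here is a minimal Python sketch of one way such confounded data could be simulated. The coefficients and variable names are illustrative assumptions, not the exact process behind the figures below.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 5_000

# Genetic predisposition (confounder): a score between 0 and 1.
G = rng.uniform(0, 1, n)

# Treatment assignment depends on genetics: patients with a higher
# genetic score are more likely to receive the drug.
T = rng.binomial(1, 0.3 + 0.4 * G)

# Recovery depends on BOTH treatment (a real causal effect)
# and genetics (the confounding path).
R = rng.binomial(1, 0.3 + 0.15 * T + 0.4 * G)

df = pd.DataFrame({"treatment": T, "recovery": R, "genetics": G})
```

Because genetics pushes both treatment and recovery upward, a naive comparison of treated and untreated patients in this data will overstate the drug's effect.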

When Hume wrote about "constant conjunction," he was describing what we now call correlation. We observe that certain events tend to occur together, and our minds naturally form expectations based on these patterns.

In our biomedical example, we observe strong correlations between:

  • Treatment and recovery
  • Genetics and recovery
  • Treatment and genetics
Figure 1: Observed correlations between Treatment ↔︎ Recovery, Treatment ↔︎ Genetics and Genetics ↔︎ Recovery. Observation: Strong associations exist, but which are causal?
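Using the simulated DataFrame from the earlier sketch, the same pattern can be reproduced with a pairwise correlation matrix:

```python
# df is the simulated DataFrame from the earlier sketch.
print(df[["treatment", "recovery", "genetics"]].corr().round(2))
# All three pairwise correlations are positive, yet the matrix alone
# cannot tell us which of these links, if any, are causal.
```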

These correlations are real; they exist in the data. But here's where Hume's insight becomes crucial: observing that two things occur together doesn't tell us whether one causes the other, or whether they're both caused by something else entirely.

This is the fundamental challenge of causal inference. We can measure correlations with precision, but correlations alone cannot answer causal questions. We need something more: we need to understand the underlying causal structure, to control for confounders, and to think in terms of counterfactuals.

The numbers we see in Figure 1 above, the correlation coefficients, are seductive in their clarity. But as we'll discover, they can be misleading guides to understanding what truly causes what.

The Confounding Trap: When Good Models Go Bad

Imagine you're a data scientist tasked with evaluating this new drug. You have data on thousands of patients: some received treatment, some didn't. You want to know: Does the treatment work?

The most straightforward approach, what we might call "predictive modelling" or "naive analysis", would be to simply compare recovery rates between treated and untreated patients. Fit a model, get a coefficient, report the result. Done.

But this approach has a fatal flaw: it assumes that the only difference between treated and untreated patients is the treatment itself. In reality, treated and untreated patients might differ in many ways: age, genetics, socioeconomic status, disease severity, and countless other factors.

When these other factors (confounders) influence both treatment assignment and recovery, the naive model will give you a biased estimate. It will mix together:

  • The true causal effect of treatment
  • The spurious association created by the confounder

The result? You might conclude the drug works when it doesn't, or that it doesn't work when it does. In medicine, such mistakes can literally be a matter of life and death.

This is why we need causal models that explicitly account for confounders. By controlling for genetic predisposition (or other confounders), we can isolate the true treatment effect from the spurious associations.

The Naive vs. Correct Analysis

Figure 2: Confounding visualisation: When we control for genetics, the treatment effect becomes clearer and more accurate across different genetic groups.

When we look at the data without controlling for genetics, we see one picture: treatment and recovery are strongly associated.

Figure 2 (left) shows the outcome of the naive analysis (predictive modelling approach):

Estimated treatment effect: 0.238, meaning treatment appears to increase recovery by 23.8 percentage points.

But this picture is distorted by the confounding influence of genetics.

When we control for genetics, that is, when we compare treated and untreated patients within the same genetic group, we see a different picture.

Figure 2 (right) shows the outcome of the correct analysis (causal modelling approach):

True treatment effect: 0.112, meaning treatment increases recovery by 11.2 percentage points after controlling for genetics.

The treatment effect becomes clearer, more accurate. We're no longer comparing apples to oranges; we're comparing like to like.

This is the fundamental principle of causal inference: to identify causal effects, we need to make fair comparisons. We need to compare what would happen to the same person (or similar people) under different treatment conditions. Controlling for confounders helps us create these fair comparisons.
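Using the illustrative simulation from earlier, the two analyses can be sketched side by side: a raw difference in recovery rates versus a simple back-door adjustment that stratifies on the confounder. The exact numbers will differ from those in Figure 2, since the simulation coefficients above are assumptions.

```python
import pandas as pd  # df is the simulated DataFrame from the earlier sketch

# Naive estimate: raw difference in recovery rates between treated
# and untreated patients (ignores the confounder).
naive = (df.loc[df.treatment == 1, "recovery"].mean()
         - df.loc[df.treatment == 0, "recovery"].mean())

# Adjusted estimate: stratify on genetics, compute the treated-vs-
# untreated difference within each stratum, and average the strata
# weighted by their size (a simple back-door adjustment).
df["g_bin"] = pd.qcut(df["genetics"], q=5, labels=False)
adjusted = 0.0
for _, stratum in df.groupby("g_bin"):
    diff = (stratum.loc[stratum.treatment == 1, "recovery"].mean()
            - stratum.loc[stratum.treatment == 0, "recovery"].mean())
    adjusted += diff * len(stratum) / len(df)

print(f"Naive estimate:    {naive:.3f}")
print(f"Adjusted estimate: {adjusted:.3f}")
```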

The visualisation shows this dramatically. On one side (Figure 2, left), we see the naive analysis, the confounded estimate that mixes treatment and genetic effects. On the other side (Figure 2, right), we see the causal analysis, the un-confounded estimate that isolates the true treatment effect by looking within genetic groups.

The difference between these two estimates is the confounding bias, the error we make when we ignore confounders. In real-world applications, this bias can be substantial, leading to incorrect conclusions and poor decisions.

Counterfactuals: The Ghosts of What Might Have Been

Here's a thought experiment that gets to the heart of causal inference:

Imagine a patient named Sarah. She received the treatment and recovered. The question: Did the treatment cause her recovery?

To answer this, we would need to know: What would have happened to Sarah if she had NOT received the treatment? If she would have recovered anyway, then the treatment didn't cause her recovery. If she would not have recovered, then the treatment did cause it.

But here's the problem: we can never observe this counterfactual. Sarah either received treatment or she didn't. We can't go back in time and run the experiment again with the opposite treatment. We can't observe both the factual outcome (what happened) and the counterfactual outcome (what would have happened) for the same person.

This is what statisticians call "the fundamental problem of causal inference." We can observe correlations, but we can never directly observe causation because causation requires comparing what happened to what would have happened—and we can only observe one of these.

Yet, somehow, we do make causal claims. We do believe that treatments cause recoveries, that smoking causes cancer, that education causes better outcomes. How is this possible?

The answer lies in causal inference methods. By making assumptions (about exchangeability, positivity, and consistency), by using randomisation, by controlling for confounders, and by leveraging natural experiments, we can estimate what we cannot observe. We can infer counterfactuals from factual data.

This is the magic of causal inference: it allows us to answer "what if" questions even though we can only observe "what is."

Figure 3: Counterfactuals. Key Insight: We can never observe counterfactuals directly. This is the 'fundamental problem of causal inference'. Causal models help us estimate what we cannot observe.

Counterfactuals are, by definition, unobservable. But we can visualise them conceptually. In Figure 3, we show:

  • Factual outcomes: What actually happened (the data we can observe)
  • Counterfactual outcomes: What would have happened (what we must infer)

For treated patients, the counterfactual is: what if they hadn't been treated?
For untreated patients, the counterfactual is: what if they had been treated?

The difference between factual and counterfactual outcomes is the individual treatment effect, the causal effect of treatment for each specific person. This effect can vary from person to person (treatment effect heterogeneity), which is why some patients benefit more from treatment than others.

When we average these individual treatment effects across all patients, we get the average treatment effect (ATE), the overall causal effect of treatment in the population.

Average Treatment Effect: 0.207
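One advantage of working with simulated data is that, unlike in the real world, we can generate both potential outcomes for every patient and average the individual effects directly. Here is a sketch using the same illustrative coefficients as before, so the resulting number will not match the 0.207 reported above exactly:

```python
# G and rng come from the earlier simulation sketch.
p_if_treated   = 0.3 + 0.15 * 1 + 0.4 * G   # recovery probability under do(T=1)
p_if_untreated = 0.3 + 0.15 * 0 + 0.4 * G   # recovery probability under do(T=0)

R1 = rng.binomial(1, p_if_treated)    # outcome if treated
R0 = rng.binomial(1, p_if_untreated)  # outcome if untreated

individual_effects = R1 - R0          # varies from patient to patient
ate = individual_effects.mean()       # average treatment effect
print(f"Simulated ATE: {ate:.3f}")
```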

The visualisation shows this dramatically. We see arrows connecting factual to counterfactual outcomes. The length of these arrows represents the treatment effect. Some arrows are long (large treatment effect), some are short (small treatment effect). This heterogeneity is real and important: it tells us that treatment doesn't affect everyone the same way.

Understanding counterfactuals is essential for causal reasoning. It's what allows us to move from "treatment and recovery are correlated" to "treatment causes recovery." It's the bridge between observation and causation.

Why We Need Causal Models: A Practical Imperative

Let's step back and ask a fundamental question: Why do we care about causation at all? Why isn't prediction enough?

The answer lies in what we want to do with our models, not just what we want to know.

The Limits of Prediction

Predictive models are incredibly powerful. They can forecast stock prices, predict customer behavior, identify fraud, and diagnose diseases. They answer the question: "What will happen?"

But prediction has limits:

  1. Association ≠ Causation: A predictive model might find that carrying matches is associated with lung cancer (because matches are correlated with smoking). But this doesn't mean matches cause cancer—it means the model found a pattern, not a cause.
  2. Confounding: Predictive models can't distinguish between correlation and causation. They'll happily use confounders to improve predictions, even if those confounders create spurious associations.
  3. Distribution Shift: Predictive models assume the future will look like the past. When distributions change (new patient populations, new contexts, new environments), predictions break down.
  4. No Interventions: Predictive models can't answer "what if" questions. They can't tell you what will happen if you change something, only what will happen if things stay the same.

The Power of Causation

Causal models answer a different question: "What will happen if I intervene?"

This is crucial because:

  1. Decision-Making: In medicine, we don't just want to predict outcomes—we want to change them. We need to know: "If I give this patient the treatment, will it help?" This is a causal question, not a predictive one.
  2. Understanding Mechanisms: Causal models help us understand why things happen, not just what will happen. This understanding is essential for science, policy, and practice.
  3. Robustness: Causal relationships are more stable than correlations. If A causes B, this relationship holds across different contexts, even when distributions change.
  4. Generalization: Causal knowledge transfers better than correlational knowledge. If we understand the causal mechanism, we can apply this knowledge in new situations.

A Concrete Example

Imagine a new patient arrives at the hospital. They have a genetic predisposition score of 0.5. Should we treat them?

A predictive model would say: "Based on past patients with similar characteristics, this patient has a 75% chance of recovery if treated and 60% if not treated. Therefore, treat them."

But this prediction might be confounded. The model might be using the correlation between treatment and recovery, which includes both the true treatment effect and the genetic effect. The recommendation might be wrong.

A causal model would say: "After controlling for this patient's genetics, the true causal effect of treatment is to increase recovery probability by 15 percentage points. Therefore, treat them."

The causal model's recommendation is based on understanding the true mechanism, not just patterns in the data. It's more reliable because it separates causation from correlation.
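As a sketch of how such an interventional recommendation could be produced from the simulated data: fit an outcome model that includes the confounder, then compare its predictions for the same patient under the two interventions do(T = 1) and do(T = 0). The model choice and the patient's score of 0.5 are illustrative assumptions.

```python
from sklearn.linear_model import LogisticRegression

# df is the simulated DataFrame from the earlier sketch.
X = df[["treatment", "genetics"]].to_numpy()
y = df["recovery"].to_numpy()
outcome_model = LogisticRegression().fit(X, y)

# New patient with genetic predisposition score 0.5: predict recovery
# under each intervention, holding genetics fixed.
g_new = 0.5
p_treated   = outcome_model.predict_proba([[1.0, g_new]])[0, 1]
p_untreated = outcome_model.predict_proba([[0.0, g_new]])[0, 1]

print(f"Estimated causal effect of treating this patient: "
      f"{p_treated - p_untreated:.3f}")
```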

The Biomedical Imperative

In medicine, the stakes are high. Recommending the wrong treatment can harm patients. Approving ineffective drugs wastes resources and delays finding effective treatments. Missing effective treatments costs lives.

This is why causal inference isn't just an academic exercise—it's a practical necessity. We need to know not just what's correlated with what, but what causes what. We need to understand not just patterns, but mechanisms.

Hume's philosophical insight from nearly 300 years ago remains profoundly relevant: we can only observe associations, but we need to understand causes. Causal inference is the bridge between what we can observe and what we need to know.

References and Further Reading

  • Hume, D. (1739). A Treatise of Human Nature
  • Pearl, J., & Mackenzie, D. (2018). The Book of Why: The New Science of Cause and Effect
  • Hernán, M. A., & Robins, J. M. (2020). Causal Inference: What If
  • Imbens, G. W., & Rubin, D. B. (2015). Causal Inference in Statistics, Social, and Biomedical Sciences
