Causal inference is the study of the effects of real or conceptual interventions on an outcome.
Randomized controlled trials
In medicine and science, causal inference is often approached through controlled studies. Controlled studies allow us to directly compare the results of assigning the intervention of interest with the results of a "control" intervention. For example, to assess the effect of a new medication, we might compare the health outcomes of people given the medication with those of people given a placebo. To avoid haphazard results, the medication and placebo groups must each include a large enough number of people. Finally, to ensure a fair comparison between comparable groups, we randomly assign the subjects enrolled in the study to one of the two groups. This is called a randomized controlled trial (RCT) and is a staple of medical and pharmaceutical research.
Gaining knowledge through RCTs
Due to randomization, and given a large enough study enrollment, the two study groups (medication and placebo) are comparable on average. For instance, they will have roughly the same proportion of sicker patients at baseline and roughly the same proportion of people in each age group. After the medication or placebo is assigned, any systematic difference between the two groups should be due to the assignment, all other things having been made equal. Therefore, a direct comparison of the two groups' average health outcomes is a valid estimate of the effect of the medication vs. placebo.
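The argument above can be illustrated with a small simulation. This is only a sketch under invented assumptions: the outcome model, the effect size of -2.0, the 50/50 assignment, and the function name `simulate_rct` are all hypothetical and are not taken from any real trial.

```python
import random
import statistics


def simulate_rct(n=20000, true_effect=-2.0, seed=0):
    """Simulate a two-arm trial.

    Returns (difference in mean outcome, difference in mean baseline severity),
    each computed as medication arm minus placebo arm.
    """
    rng = random.Random(seed)
    outcomes = {True: [], False: []}
    severities = {True: [], False: []}
    for _ in range(n):
        # Baseline severity is drawn before, and independently of, assignment.
        severity = rng.gauss(0.0, 1.0)
        # Coin-flip randomization into medication (True) or placebo (False).
        gets_medication = rng.random() < 0.5
        # Hypothetical outcome model: sicker patients score worse,
        # and the medication shifts the score by true_effect.
        outcome = 10.0 + 3.0 * severity + rng.gauss(0.0, 1.0)
        if gets_medication:
            outcome += true_effect
        outcomes[gets_medication].append(outcome)
        severities[gets_medication].append(severity)
    effect_estimate = statistics.mean(outcomes[True]) - statistics.mean(outcomes[False])
    baseline_gap = statistics.mean(severities[True]) - statistics.mean(severities[False])
    return effect_estimate, baseline_gap


effect_estimate, baseline_gap = simulate_rct()
print(f"estimated effect: {effect_estimate:.2f} (simulated truth: -2.00)")
print(f"baseline severity gap between arms: {baseline_gap:.3f}")
```

With a large enrollment, the baseline severity gap between the arms should be close to zero and the difference in means should be close to the simulated effect. Rerunning with a small `n` shows how a small trial can produce haphazard results.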
The limitations of RCTs
Clinical trials are generally run on a restricted population that is clearly delimited in the study protocol. This is often done for ethical reasons; for instance, it is usually not considered ethical to test a new medication on pregnant women or on children. In other cases, the study might focus on patient groups that are easier to access and less heterogeneous than the general population. Participants are often accepted into the study of a specific disease only if they have no other chronic conditions. Some RCTs estimate the effect of taking a specific medication on the condition that no other medication has previously been taken for the same medical condition. In the general population, medications can be switched routinely and are often used in conjunction with medications for other illnesses. Therefore, the effects estimated in an RCT, while valid for the trial population, may not fully reflect the effectiveness of the medication in the general population under actual usage patterns.
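A short worked example may help show why an effect that is valid for the trial population need not carry over to the general population. All of the numbers below are invented for illustration: the two patient groups, their effects, and their shares of each population are hypothetical.

```python
# Hypothetical effect of the medication on a symptom score, by patient group.
effect_by_group = {"no_other_condition": -2.0, "other_chronic_condition": -0.5}

# Hypothetical mix of patients: the trial excludes people with other chronic
# conditions, while the general population includes them.
trial_mix = {"no_other_condition": 1.0, "other_chronic_condition": 0.0}
population_mix = {"no_other_condition": 0.6, "other_chronic_condition": 0.4}


def average_effect(effects, mix):
    """Average the group-specific effects, weighted by each group's share."""
    return sum(effects[group] * mix[group] for group in effects)


trial_effect = average_effect(effect_by_group, trial_mix)
population_effect = average_effect(effect_by_group, population_mix)
print(f"average effect in the trial population:   {trial_effect:.2f}")
print(f"average effect in the general population: {population_effect:.2f}")
```

Under these made-up numbers the trial estimate is correct for the people who were enrolled, but it overstates the average benefit in a population where 40% of patients have another chronic condition.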
Observational studies
There are other ways that information about health and medication can be collected.
- One such way is through a designed observational study. "Observational" refers to the fact that the study does not impose any intervention on the subjects -- it merely observes their medication usage and health status. The study might follow a selected cohort of people, sometimes with a specific disease or combination of diseases, and record their health information at regular intervals. The entrance criteria for such studies are generally much broader; since there is no experimental intervention, the ethical concerns are much reduced. The participants in these studies are therefore more representative of the general population than those in RCTs, and their medication usage reflects "real-life" usage.
- A second way for observational data to be collected is through electronic health records, which are automatically gathered when a patient uses a health service connected to a centralized system. In Québec, prescription information is collected through the Régie de l'assurance maladie du Québec (RAMQ) for people enrolled in the public plan. This data is less structured; for instance, information is collected when the medical services are used, not according to a study plan.
Estimating the effects of medication through observational study data
The complication with observational data is that, because medication is not randomized, a direct comparison between patients taking different medications is generally not a fair one. For instance, if we simply compare people taking a higher dosage of asthma medication with people taking a lower dosage, we will generally find that those taking the higher dosage have worse asthma symptoms. This is not due to the medication but rather to the fact that previously severe asthma led to a prescription for a higher dosage. This is why, in such data, correlation does not imply causation.
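The asthma example can be reproduced in a small simulation. This is a sketch with invented numbers: the prescribing probabilities, the symptom model, and the simulated benefit of one point are hypothetical, as are the function names.

```python
import random
import statistics


def simulate_observational(n=20000, seed=1):
    """Return a list of (severe, high_dose, symptoms) records."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        # Past asthma severity, which drives both prescribing and symptoms.
        severe = rng.random() < 0.4
        # Not randomized: severe patients are far more likely to get the higher dose.
        high_dose = rng.random() < (0.8 if severe else 0.2)
        # Hypothetical truth: the higher dose lowers the symptom score by 1 point.
        symptoms = 3.0 + (4.0 if severe else 0.0) - (1.0 if high_dose else 0.0)
        symptoms += rng.gauss(0.0, 1.0)
        records.append((severe, high_dose, symptoms))
    return records


def naive_difference(records):
    """Mean symptom score on the higher dose minus mean score on the lower dose."""
    high = [symptoms for _, dose, symptoms in records if dose]
    low = [symptoms for _, dose, symptoms in records if not dose]
    return statistics.mean(high) - statistics.mean(low)


records = simulate_observational()
naive = naive_difference(records)
print(f"naive high-vs-low difference in symptoms: {naive:.2f} (simulated truth: -1.00)")
```

Although the higher dose helps every simulated patient, the naive comparison comes out positive, because the higher-dose group contains a much larger share of patients with previously severe asthma.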
Causal inference
To untangle the effects of medications (or of any intervention) when randomization is imperfect or non-existent, statistical modeling is required. Beyond the statistical computations, we must also have substantial subject-matter knowledge about why medications are prescribed and taken, and about which risk factors are involved in the outcomes of interest. Formal causal inference theory and methodology must be used to properly analyze the causal effect. Some traditional statistical methods are valid, but only in specific contexts. For instance, in a perfect RCT, the difference in mean outcomes consistently estimates a causal effect in a specific population, so a t-test can be validly applied. I research new methodology that, under a list of clearly defined assumptions, will consistently estimate a causal effect from observational data.
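As a minimal illustration of what adjustment under explicit assumptions can look like, the sketch below applies stratification (standardization) to simulated asthma-style data. This is a standard textbook technique, not the methodology referred to above, and it relies on a strong assumption that is stated in the code: past severity is the only confounder and it is measured without error. The data-generating numbers and function names are hypothetical.

```python
import random
import statistics


def simulate_observational(n=20000, seed=1):
    """Simulated records (severe, high_dose, symptoms); all numbers are invented."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        severe = rng.random() < 0.4
        high_dose = rng.random() < (0.8 if severe else 0.2)
        # Hypothetical truth: the higher dose lowers the symptom score by 1 point.
        symptoms = 3.0 + (4.0 if severe else 0.0) - (1.0 if high_dose else 0.0)
        symptoms += rng.gauss(0.0, 1.0)
        records.append((severe, high_dose, symptoms))
    return records


def standardized_difference(records):
    """Compare doses within each severity stratum, then average over strata.

    Assumption: severity is the only confounder and is fully measured.
    """
    total = len(records)
    adjusted = 0.0
    for stratum in (True, False):
        in_stratum = [r for r in records if r[0] == stratum]
        high = [symptoms for _, dose, symptoms in in_stratum if dose]
        low = [symptoms for _, dose, symptoms in in_stratum if not dose]
        within = statistics.mean(high) - statistics.mean(low)
        # Weight each stratum by its share of the whole sample.
        adjusted += within * len(in_stratum) / total
    return adjusted


records = simulate_observational()
adjusted = standardized_difference(records)
print(f"severity-adjusted difference in symptoms: {adjusted:.2f} (simulated truth: -1.00)")
```

Within each severity stratum the two dose groups are comparable in this simulation, so the weighted average recovers the simulated effect. In real data the result is only as credible as the assumption that all relevant confounders have been identified and measured, which is where subject-matter knowledge comes in.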