15 October 1997 | Volume 127 Issue 8 Part 2 | Pages 757-763
The aim of many analyses of large databases is to draw causal inferences about the effects of actions, treatments, or interventions. Examples include the effects of various options available to a physician for treating a particular patient, the relative efficacies of various health care providers, and the consequences of implementing a new national health care policy. A complication of using large databases to achieve such aims is that their data are almost always observational rather than experimental. That is, the data in most large data sets are not based on the results of carefully conducted randomized clinical trials, but rather represent data collected through the observation of systems as they operate in normal practice without any interventions implemented by randomized assignment rules. Such data are relatively inexpensive to obtain, however, and often do represent the spectrum of medical practice better than the settings of randomized experiments. Consequently, it is sensible to try to estimate the effects of treatments from such large data sets, even if only to help design a new randomized experiment or shed light on the generalizability of results from existing randomized experiments. However, standard methods of analysis using available statistical software (such as linear or logistic regression) can be deceptive for these objectives because they provide no warnings about their propriety. Propensity score methods are more reliable tools for addressing such objectives because the assumptions needed to make their answers appropriate are more assessable and transparent to the investigator.
Author and Article Information
From Harvard University, Cambridge, Massachusetts.
STATISTICAL METHODS
Estimating Causal Effects from Large Data Sets Using Propensity Scores
Note: This article is one of a series of articles comprising an Annals of Internal Medicine supplement entitled "Measuring Quality, Outcomes, and Cost of Care Using Large Databases: The Sixth Regenstrief Conference."
Grant Support: In part by a grant from the National Science Foundation (SES-9207456).
Acknowledgments: The author thanks Jennifer Hill and Frederick Mosteller for helpful editorial comments on an earlier draft of this article.
Requests for Reprints: Donald B. Rubin, PhD, Harvard University, Department of Statistics, Science Center, 6th Floor, 1 Oxford Street, Cambridge, MA 02138.