Back to the Future: Clinical Vignettes and the Measurement of Physician Performance
- John Norcini, PhD
- From Foundation for Advancement of International Medical Education and Research, Philadelphia, PA 19104.
The past 2 decades have seen many efforts to improve the quality of health care. These efforts have relied on a series of methods devised by workers in the field of quality management science and, in some cases, used successfully in industry for more than 50 years (1, 2).
Measuring performance is central to the quality management sciences. The principal measures in health care are patient outcomes and the process of care that physicians provide in practice. These measures identify areas that need improvement, signal the accomplishment of goals, and respond to the need for accountability (3). Unfortunately, methods of assessing physicians' practice performance are in their infancy, and they face unresolved logistic and psychometric challenges. For instance, comparing physicians on the basis of their daily work is difficult because they see patients who have different conditions (case mix) and, even when the conditions are the same, patients' severity of illness and comorbid conditions differ. Furthermore, attributing patient outcomes solely to individual physicians may be inappropriate because care is often rendered by teams (4).
In this issue, Peabody and colleagues (5) test the quality of practice by using clinical vignettes. Because all physicians take the same vignettes under secure conditions, problems of case mix, severity of illness, and attribution are eliminated (5). This elegant, multisite study compared physician performance by using 3 sources of data on physicians' actions and decisions: what physicians did while caring for standardized patients (the “gold standard”), what physicians wrote in the medical records of those visits (the usual source of information for evaluating quality), and what actions physicians took while using computer-based clinical vignettes of matched patients (the proposed new method for evaluating quality). With identical quality criteria applied, differences in performance among the 3 measures were statistically significant but small. The average percentage correct on the vignettes was 5% lower than the average score on the standardized patients and 5% higher than the average score on the medical records. In addition, the medical records and vignettes were similar in capturing unnecessary care. These findings led the authors to conclude that clinical vignettes can be a useful tool in measuring clinical quality.
This work builds on earlier research comparing performance on clinical vignettes and actual patient care. Years ago, researchers retrospectively compared computer-based clinical vignettes with the real medical records of patients with similar medical problems and found no differences in the extent to which the physicians met quality-of-care criteria (6, 7). Peabody and colleagues took this work a step further by prospectively comparing practice performance with standardized patients against identical cases administered in computer-based form. Although the average proportion of correct actions taken while using the vignettes was within 5% of the averages for the standardized patients and the medical records, averages alone are not enough. Improving the quality of patient care requires specific feedback to physicians about what they are doing. Basing that feedback on clinical vignettes requires a demonstration that physicians chose the same responses under each assessment method, criterion by criterion. Undoubtedly, the authors will test this in future secondary data analyses.
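The distinction between similar averages and identical responses can be made concrete with a small sketch. The data below are hypothetical and are not drawn from the study; the function names are illustrative, and Cohen's kappa is used here only as one common way to express chance-corrected agreement, not as the analysis the authors performed.

```python
# Hypothetical data (an assumption for illustration, not study results):
# 1 = criterion met, 0 = not met, for one physician on ten quality criteria,
# as scored by two assessment methods on a matched case.
standardized_patient = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
vignette = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]


def percent_correct(scores):
    """Average score: percentage of criteria met."""
    return 100 * sum(scores) / len(scores)


def item_agreement(a, b):
    """Fraction of criteria on which the two methods give the same result."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Chance-corrected agreement between two binary score lists."""
    n = len(a)
    observed = item_agreement(a, b)
    p_a = sum(a) / n  # proportion met under method a
    p_b = sum(b) / n  # proportion met under method b
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1:
        return 1.0  # both methods are constant and identical
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    print("Standardized patient % correct:", percent_correct(standardized_patient))
    print("Vignette % correct:", percent_correct(vignette))
    print("Criterion-level agreement:", item_agreement(standardized_patient, vignette))
    print("Cohen's kappa:", round(cohen_kappa(standardized_patient, vignette), 2))
```

In this constructed example the two methods yield the same percentage correct, yet they agree on fewer than half of the individual criteria, so feedback based on one method would misdescribe what the physician did under the other.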
A single study, even one as good as that of Peabody and colleagues, cannot definitively answer the question of whether a measure of physician performance actually does what it purports to do. Other investigators need to perform validation studies of the clinical vignette method so that a body of evidence about its accuracy is created (8). Fortunately, this study of the validity of clinical vignettes is not the first of its kind. In addition to the work cited by the authors, similar methods have been available in paper- and computer-based form for more than 40 years. Consequently, a substantial body of research supports the use of the clinical vignette method and helps us to put the current results into a larger context (9-12).
One issue relevant to the study by Peabody and colleagues that previous research addresses is the scoring of clinical vignettes. Peabody and colleagues defined good-quality clinical practice as the comprehensive provision of services, which led them to use many criteria to evaluate the care of each patient. Perhaps as a consequence, few, if any, study participants achieved perfect scores. In fact, on average, they missed more than a quarter of the criteria, even on the uncomplicated cases. Does this outcome mean that the quality of care provided by the physicians in the study was seriously deficient? More likely, it indicates that some criteria produced redundant information while others were important, but not essential, especially in the context of a first visit. This explanation is consistent with the literature on the scoring of clinical vignettes (13, 14). Simply tallying the criteria, when there are many of them, tends to produce scores that reflect thoroughness rather than competence.
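A brief sketch may help show why a simple tally can reward thoroughness over competence. Everything in it is an assumption made for illustration: the criterion labels, the split into essential and optional criteria, and the two physicians are hypothetical, and the essential-only score is just one possible alternative to tallying, not the scoring used in the study or in the cited literature.

```python
# Hypothetical criteria for one vignette (labels and the essential/optional
# split are illustrative assumptions, not the study's criteria).
ESSENTIAL = {"E1", "E2"}
OPTIONAL = {"O1", "O2", "O3", "O4", "O5", "O6"}
ALL_CRITERIA = ESSENTIAL | OPTIONAL


def tally_score(criteria_met):
    """Simple tally: fraction of all criteria that were met."""
    return len(criteria_met & ALL_CRITERIA) / len(ALL_CRITERIA)


def essential_score(criteria_met):
    """Alternative: fraction of the essential criteria that were met."""
    return len(criteria_met & ESSENTIAL) / len(ESSENTIAL)


# A thorough physician who meets every optional criterion but misses an
# essential one, and a focused physician who meets both essential criteria
# but few optional ones.
thorough = {"E1", "O1", "O2", "O3", "O4", "O5", "O6"}
focused = {"E1", "E2", "O1"}

if __name__ == "__main__":
    for name, met in [("thorough", thorough), ("focused", focused)]:
        print(name, "tally:", tally_score(met), "essential:", essential_score(met))
```

Under the tally, the thorough physician ranks well above the focused one even though only the focused physician met every essential criterion; the essential-only score reverses that ordering.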
As the authors point out, a great strength of the clinical vignette method is that it provides a uniform stimulus to all physicians, so that responses are comparable over sites and completely attributable to the individual physician. Presenting the same cases to all physicians eliminates variability due to other health care providers, patient complications, severity of illness, and the like. On the other hand, it also makes it impossible to evaluate physician attributes that are directly related to competence, such as teamwork, the ability to gain patient adherence, and systems-based practice. More significantly, the use of vignettes to evaluate physicians would probably not detect any influences of the system of care itself, which is often a more appropriate focus of quality improvement than the individual physician.
Finally, and most important, the motivation of the participating physicians will be critical to the usefulness of clinical vignettes in quality improvement. The ability to perform well on clinical vignettes is a necessary but not sufficient condition for good performance in practice. Physicians may be tempted to respond to clinical vignettes in an ideal fashion that may differ from their usual clinical performance, especially if they are penalized for not doing well. Therefore, good performance on a clinical vignette does not guarantee high-quality day-to-day performance. However, poor performance on clinical vignettes should identify physicians who do not know the ideal approach to a patient because of a knowledge deficit. Clinical vignettes might therefore be most useful as a complement to work-based methods, as an efficient and effective way to identify knowledge deficits that need correction.
In summary, Peabody and colleagues take us back to the future. Their work builds on an extensive body of research on clinical vignettes and moves the work forward by demonstrating that performance on clinical vignettes is linked with actual practice performance on matched cases. Secondary analyses will help us better understand how clinical vignettes fit into quality management, but the motivations of the physicians taking them will ultimately be more important.
Article and Author Information
- Potential Financial Conflicts of Interest: None disclosed.
- Requests for Single Reprints: John Norcini, PhD, Foundation for Advancement of International Medical Education and Research, 3624 Market Street, Philadelphia, PA 19104; e-mail, jnorcini@faimer.org.