Rapid Responses to:
Electronic letters published:
Milo A. Puhan, MD Horten Centre, University Hospital of Zurich, Switzerland, Johann Steurer, and Gerben ter Riet
milo.puhan{at}evimed.ch Milo A. Puhan, et al.
IN RESPONSE: Drs. Broce and Reyes state that it remains unclear how the surveyed physicians derived posttest probabilities. They wondered whether we provided the relevant equations and whether we recorded if participants calculated or guessed posttest probabilities. In fact, we did neither, so we could not determine how the physicians arrived at their posttest probabilities. However, our assumption, based on experience and research, was that the vast majority of physicians do not formally calculate posttest probabilities but use quantitative information about a test’s informativeness in an inexact way (1, 2). Along the same lines, we developed the inexact numerical graphical format. The setting of our trial, a lecture hall at a continuing medical education conference, was not conducive to studying the physicians’ cognitive processes. However, we would welcome any studies investigating physicians’ cognitive processes when they are confronted with quantitative information about a test’s informativeness.

Dr. Brotman argues that our study did not test whether, at extreme combinations of sensitivity and specificity (for example, a sensitivity of 0.97 and a specificity of 0.03), the likelihood ratio (equal to 1) is the superior measure of association. His hypothesis might be correct, but such extreme values of sensitivity and specificity are rare in commonly encountered diagnostic situations. We decided to present vignettes of more common clinical scenarios. Nevertheless, in our vignettes 1 and 4, which are closest to the situation that Dr. Brotman would have preferred (sensitivity and specificity of 0.93 and 0.45 [LR = 1.7] and of 0.40 and 0.79 [LR = 0.8], respectively), the differences between the 2 numerical formats in posttest probability estimates were negligible.

We agree that relevant experts should be involved in the design of survey instruments before they are administered. We therefore pilot-tested and revised our vignettes with the help of 21 internists.
We cannot exclude the possibility that a more sophisticated development process might have resulted in better vignettes. We agree that the vignettes’ test-retest reliability could, and perhaps should, have been tested before use. However, we are not sure how the validity of the vignettes might be assessed beyond face validity through the eyes of experienced clinicians.

We welcome the suggestions for further studies aimed at refuting our findings while taking on board the additional methodological aspects pointed out by Drs. Broce, Reyes, and Brotman. Or, in the spirit of Karl Popper: design carefully, and aim to refute in order to be able to corroborate convincingly. For anyone interested, a copy of our questionnaire is available from the corresponding author.

From the University Hospital of Zurich, Horten Centre, University Hospital, Postfach Nord, CH-8091 Zurich, Switzerland, and the Department of General Practice at the Academic Medical Center, Amsterdam, The Netherlands.

Potential Financial Conflicts of Interest: None disclosed.

References
(1) Reid MC, Lane DA, Feinstein AR. Academic calculations versus clinical judgments: practicing physicians' use of quantitative measures of test accuracy. Am J Med. 1998;104:374-80.
(2) Steurer J, Fischer JE, Bachmann LM, Koller M, ter Riet G. Communicating accuracy of tests to general practitioners: a controlled study. BMJ. 2002;324:824-26.

Conflict of Interest: None declared
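
As a minimal sketch (not part of the authors' letter), the likelihood ratios quoted in the response above can be reproduced from the stated sensitivities and specificities, assuming the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. The function name and the case labels are illustrative assumptions. Under these definitions, the value 1.7 for vignette 1 matches the positive ratio and the value 0.8 for vignette 4 matches the negative ratio.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a dichotomous test.

    Assumes 0 < specificity < 1; no guard against division by zero.
    """
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# (sensitivity, specificity) pairs quoted in the letter; labels are illustrative.
cases = {
    "vignette 1": (0.93, 0.45),
    "vignette 4": (0.40, 0.79),
    "extreme example": (0.97, 0.03),
}

results = {name: likelihood_ratios(*pair) for name, pair in cases.items()}
for name, (lr_pos, lr_neg) in results.items():
    print(f"{name}: LR+ = {lr_pos:.1f}, LR- = {lr_neg:.1f}")
```

For the extreme example, both ratios come out at 1, which is the point at issue: such a test result leaves the pretest probability unchanged.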
Mike Broce, BS CAMC Institute, Bernardo Reyes MD CCI
bernardo.reyes{at}camc.org Mike Broce, et al.
We reviewed with interest the work published by Dr. Puhan and his group in the August 2, 2005 issue of this journal. We agree that physicians must be able to interpret the real value of diagnostic testing in confirming clinical suspicions if medicine is to be practiced better. After reading their conclusions, we believe that replication of their results is needed, perhaps taking our suggestions into account. These suggestions are not intended to diminish the findings of what we consider an excellent study.

First, although a table for the clinical vignettes was provided, it was not clear whether equations for calculating changes in illness probability were provided to the surveyed physicians. Physicians may be less likely to remember complex equations that are not commonly used in clinical practice. If the equations were not available to the physicians, then the authors could have been testing knowledge and recall of biostatistical methods rather than the ability to calculate post-test probability.

Next, even if the researchers were able to determine whether actual calculations had been made, we speculate that the authors were not able to determine the reason(s) for not providing the correct post-test probability. Was it because the physicians simply did not know how to do the calculations (and therefore guessed the answer), or was it because they did not agree with the logic of the diagnostic testing? If the latter is true, then it is likely that the physicians based their answers on what they thought the post-test probability would be, regardless of the testing.

Regarding the survey instrument, we suggest avoiding the mixing of test results with the findings of physical exams or medical histories. The aim of this suggestion is simply to avoid confusing scenarios that could influence the results of any post-test probability calculations.
Furthermore, in order to reduce unexplained errors, we recommend selecting a team of medical experts (familiar with the medical conditions of interest) to help design and validate the instrument before implementation. Along these same lines, after survey construction, it would be wise to test the validity and reliability of the instrument before administering it to a survey group. That way, any conclusions or generalizations about the research findings would be sound.

Finally, for researchers who wish to replicate this or a similarly designed study, detailed information about the methods and procedures, especially participant instructions and a copy of the actual survey instrument, included in the manuscript would be most beneficial.

Conflict of Interest: None declared
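
For reference, the letter above does not spell out which equations it has in mind. A reasonable assumption is the standard odds form of Bayes' theorem, sketched below with p denoting the pretest probability and LR the likelihood ratio of the observed test result (both symbols are introduced here and do not appear in the letter).

```latex
\begin{align*}
\text{pretest odds} &= \frac{p}{1-p}, \\
\text{posttest odds} &= \text{pretest odds} \times \mathrm{LR}, \\
\text{posttest probability} &= \frac{\text{posttest odds}}{1+\text{posttest odds}}, \\
\mathrm{LR}^{+} &= \frac{\text{sensitivity}}{1-\text{specificity}}, \qquad
\mathrm{LR}^{-} = \frac{1-\text{sensitivity}}{\text{specificity}}.
\end{align*}
```

Here LR+ applies to a positive test result and LR− to a negative one.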
Daniel J. Brotman, M.D. Johns Hopkins Hospital
dbrotma1{at}jhmi.edu Daniel J. Brotman
Puhan and colleagues deserve praise for their creative assessment of how well physicians interpret diagnostic test results, but I disagree with their conclusion that likelihood ratios are no more informative than sensitivity and specificity. The authors asked clinicians to estimate the post-test probabilities of various conditions based upon pre-test probabilities and diagnostic test results. The operating characteristics of each diagnostic test were provided in terms of likelihood ratios, sensitivity/specificity, or a graphic display. There are 2 problems with this approach.

First, familiarity with the method is required. When I finished medical school in the 1990s, I had been taught to think in probabilities (sensitivity, specificity and predictive values). In contrast, I was taught to think in odds (likelihood ratios) only later in my career, during biostatistical training. To assess the relative merits of likelihood ratios (versus sensitivity and specificity) on the basis of how well clinicians know how to use them is like determining the utility of the metric system based on whether New Englanders can more accurately estimate the length of their strides in inches versus centimeters.

Second, the scenarios the authors formulated failed to expose the most serious conceptual error surrounding sensitivity and specificity. Clinicians are often not taught that both sensitivity and specificity are needed to assess post-test probability (whether the test result is positive or negative). Indeed, many medical students are taught that high sensitivity “rules out” a diagnosis if the test is negative and that high specificity “rules in” the diagnosis when the test is positive. The limitations of this rule-of-thumb are exposed by an inexpensive and quite versatile laboratory test that I have created. It has 97.2% sensitivity for pulmonary emboli, myocardial infarctions, and even erectile dysfunction. It is called the 2-dice test.
I roll 2 dice, each with 6 sides, and add together the values showing. Anything 3 or higher is a positive test result. The problem is that the specificity is only 2.8%. Had Puhan et al. presented a hypothetical scenario in which a test has 97.2% sensitivity and 2.8% specificity, the pre-test probability of disease was 50%, and the test was negative, I suspect that many of the physicians would have deemed the diagnosis very unlikely. In contrast, present the same physicians with a test that has a negative likelihood ratio of 1.0, and they will not be fooled. These same physicians are out there misinterpreting negative D-dimer tests in critically ill patients (1).

1. Brotman DJ, Segal JB, Jani JT, Petty BG, Kickler TS. Limitations of D-dimer testing in unselected inpatients with suspected venous thromboembolism. Am J Med. 2003;114(4):276-82.

Conflict of Interest: None declared
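
The figures for the 2-dice test in the letter above can be checked by enumerating all 36 outcomes. The following is a minimal sketch, not part of the letter: it assumes fair dice and the usual odds form of Bayes' theorem, and the helper name `posttest_probability` is an illustrative choice.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely sums of two fair six-sided dice (fairness is assumed).
sums = [a + b for a, b in product(range(1, 7), repeat=2)]

# The result ignores the patient entirely, so the chance of a positive
# result (sum of 3 or higher) is the same with or without disease.
p_positive = Fraction(sum(s >= 3 for s in sums), len(sums))

sensitivity = p_positive       # P(positive | disease)
specificity = 1 - p_positive   # P(negative | no disease)
lr_negative = (1 - sensitivity) / specificity


def posttest_probability(pretest, likelihood_ratio):
    """Convert a pretest probability to a posttest probability via odds."""
    pretest_odds = pretest / (1 - pretest)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


post = posttest_probability(Fraction(1, 2), lr_negative)

print(f"sensitivity = {float(sensitivity):.1%}")
print(f"specificity = {float(specificity):.1%}")
print(f"negative likelihood ratio = {lr_negative}")
print(f"posttest probability after a negative result = {float(post):.0%}")
```

With a pre-test probability of 50%, a negative result leaves the probability at 50%, because the negative likelihood ratio is exactly 1.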