Challenges in Systematic Reviews of Complementary and Alternative Medicine Topics
- Paul G. Shekelle, MD, PhD;
- Sally C. Morton, PhD;
- Marika J. Suttorp, MS;
- Nina Buscemi, PhD; and
- Carol Friesen, MA, MLIS
- From the Southern California Evidence-based Practice Center (RAND Corporation, Santa Monica, California; the Greater Los Angeles Veterans Affairs Healthcare System, Los Angeles, California; and the University of Alberta Evidence-based Practice Center, Edmonton, Alberta, Canada).
Abstract
The use of complementary and alternative medicine (CAM) continues to grow in the United States. The Agency for Healthcare Research and Quality has devoted a substantial proportion of the Evidence-based Practice Center (EPC) program to systematic reviews of CAM. Such syntheses present different challenges from those conducted on western medicine topics, and in many ways are more difficult. We discuss 3 challenges: identifying evidence about CAM, assessing the quality of individual studies, and addressing rare serious adverse events. We use illustrations from EPC evidence reports to show readers approaches to the 3 areas and then present specific recommendations for each issue.
Mark Helfand, MD, MPH; Sally Morton, PhD; Eliseo Guallar, MD, PhD; and Cynthia Mulrow, MD, MSc, Editors
The popularity of complementary and alternative medicine (CAM) in the United States continues to grow (1). The Agency for Healthcare Research and Quality (AHRQ) has devoted a substantial proportion of the Evidence-based Practice Center (EPC) program to reports on CAM. As of 2004, 19 of the more than 100 evidence reports concerned CAM interventions. The reports spanned a wide scope of topics, including botanicals and herbal supplements (such as garlic and ephedra), traditional medicine (such as acupuncture and ayurvedic medicine), and vitamins (such as supplementation).
In preparing CAM reports, EPC reviewers faced several challenges. In this paper, we discuss approaches used to address the following 3 issues: 1) identifying evidence about CAM interventions, 2) assessing the quality of individual studies, and 3) addressing rare serious adverse events. We then suggest recommendations in each area for future reviews of CAM interventions.
Challenge: Identifying Evidence about CAM Literature
The many biases in the publication and indexing of CAM research pose a challenge for locating literature. Publication bias refers to the tendency of investigators, reviewers, and editors to submit or accept manuscripts on the basis of the strength or direction of the findings. While publication bias is a concern in conventional medical research (2-4), in CAM research the issue is particularly complicated. Most studies published in leading CAM journals have positive results (5). Some countries, such as China, Japan, Russia, and Taiwan, publish more studies with positive results than studies with negative results; this imbalance may reflect publication bias (6).
Language-related publication bias also exists in CAM research, as it does for conventional medicine research (7). Negative CAM findings are more likely to be published in mainstream medical journals (that is, English-language journals), while positive CAM findings are more likely to be published in CAM journals (which tend to be non–English-language journals) (8-10). Thus, all else being equal, the exclusion of non–English-language reports may result in a lower estimate of intervention effect for CAM interventions than would be seen if these reports were included (10). The direction of bias, however, may depend on the CAM topic. Studies about some CAM therapies may initially be published only in non–English-language journals (8). On the other hand, in pediatric CAM, some evidence suggests that randomized, controlled trials (RCTs) tend to be published in English, and often in mainstream medical journals (9).
Another bias that hinders access is the incomplete or improper indexing of CAM journals and articles by mainstream databases such as MEDLINE (11, 12). For example, MEDLINE indexed only 10% of CAM journals identified worldwide by the National Center for Complementary and Alternative Medicine and the National Library of Medicine (12), while approximately 35% of all biomedical journals published worldwide are indexed in MEDLINE (13). Inconsistent use of keywords, descriptors, and subject headings, along with differing indexing procedures across databases, may also pose a challenge in locating CAM literature (14). The inconsistency in the terminology may partially explain why a MEDLINE search of the term alternative medicine does not capture all studies relevant to the CAM field (12).
We examined the 19 evidence reports and 2 technology assessments related to CAM for sources and methods used to locate CAM literature (Appendix Table). For almost all reports, investigators searched specialized CAM databases in addition to mainstream databases. At least 10 databases specialize in CAM: Allied and Complementary Medicine Database (AMED); Cochrane Complementary Medicine Trials Registry; Alt HealthWatch; HerbMed; Manual, Alternative and Natural Therapy Index System (MANTIS); Natural Products Alert (NAPRALERT); International Bibliographic Information on Dietary Supplements (IBIDS); Hom-Inform; The Arthritis and Complementary Medicine Database (ARCAM); and Complementary and Alternative Medicine and Pain Database (CAMPAIN). Many can be accessed without a subscription. The search strategy used for most reports did not involve English-language restrictions. For almost all reports, investigators included an extensive list of keywords to identify articles that may not be indexed by standard subject headings, or for which terminology was inconsistent (14).
Only a few investigators used hand searching (15). Slightly more than half of the investigators also searched the grey literature to address publication bias. For most reports, investigators specified that they had reviewed reference lists to identify articles improperly indexed by mainstream databases. However, investigators should be aware of the possibility of citation bias, a phenomenon whereby studies supportive of a beneficial effect are cited more frequently (16). There is evidence of this bias in mainstream medical literature (16), and it could be relevant to CAM. Similarly, for most reports, investigators specified that they had communicated with authors or experts for additional data and relevant studies. However, investigators should be aware of the possibility of bias in the provision of information, such that studies with positive results may be more likely to be communicated to investigators than are studies with negative results (7).
Challenge: Assessing and Summarizing Quality
Complementary and alternative medicine presents special challenges in the design and execution of studies, with respect to both internal validity and generalizability. These problems relate to the tension between specifying the intervention sufficiently that others can apply it and the desire to study CAM therapy as it is applied in CAM practices; they also concern the difficulties in controlling expectation bias (the systematic effect on the results of the participants' belief that a certain therapy will help them). Most CAM interventions are investigated only after they are so widespread that they can no longer be ignored, and by that time, the CAM practices are highly diversified in practices, personal experiences, biases, and expectations.
For some CAM therapies, specifying the intervention sufficiently is not a great problem, at least in theory. The use of herbs or dietary supplements such as St. John's wort, antioxidant supplements, glucosamine, and saw palmetto can be specified in a manner analogous to that used in drug trials, with a standard concentration or formula given regularly. (However, there is concern about the variation in formulation and bioactivity of some supplements from lot to lot, and this variation may be of even greater concern for botanicals in which the active ingredient is not known.)
For many other CAM therapies, however, the conceptual basis for the therapy requires an interaction between the practitioner and patient that modifies the therapy to the individual. Traditional Chinese acupuncture, spinal manipulation by chiropractors, and ayurvedic medicine require individualization of treatment based on an examination and understanding of the patient's condition using concepts that do not have an analogue in western allopathic medicine. Consequently, CAM advocates have criticized randomized clinical trials that reported no effect for not having allowed the necessary tailoring of the intervention. For this reason, “pragmatic” trials are frequently advocated for studying CAM.
In a pragmatic trial, patients are assigned to a CAM practitioner rather than a tightly specified CAM therapy. The CAM practitioners can provide their treatments in their usual fashion, individualizing the therapy for each particular patient. While this strategy allows the CAM practice to occur in its traditional fashion, it makes blinding or otherwise controlling expectation bias very difficult. Furthermore, while in one way individualizing the therapy increases generalizability, it also increases the sensitivity of the results to the skill of the practitioners. Since the intervention relies on practitioner expertise in understanding the patient and delivering the therapy, the study results are more difficult to apply to other practitioners. Thus, pragmatic trials should discuss the training and experience of the CAM practitioner. Large pragmatic trials that include many practitioners and that compare a CAM therapy with a credible control or alternative therapy would be particularly useful in assessing CAM.
The traditional way to manage expectation bias in drug trials is to use a matched placebo and a double-blind design. The magnitude of a potential “placebo response,” or even its existence at all, is controversial (17-20). However, there is agreement that the placebo response is particularly important in studies that use subjective patient measures of outcome, especially pain. Because chronic painful conditions account for a large proportion of the reasons patients seek CAM therapies, the use of placebo controls is especially important. Even medical procedures such as knee surgery have been shown to be amenable to studies with a sham control (21), so the issue of placebo and shams spans the gamut of CAM therapies. If CAM treatments, such as herbs and dietary supplements, are amenable to a matched-placebo, double-blind design, anything less is a limitation.
For many other CAM therapies, however, placebos and blinding are a much greater challenge. The use of sham procedures, rather than placebos, is possible when the CAM therapy being evaluated is well-specified. One mechanism for assessing the success of the “blinding” ability of a sham therapy is to survey patients after receipt of the therapy and ask them to guess whether they received active or sham therapy. Similar responses in patients receiving active and sham therapy are good evidence that the sham was successful in controlling expectation bias. Successful sham therapies have been reported for acupuncture and spinal manipulation (22-24).
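The survey check described above can be sketched as a comparison of two proportions. The following is a minimal illustration, not a method taken from the cited trials: the function name `blinding_check` and the guess counts are hypothetical, and a pooled two-proportion z-test is only one of several ways to compare the arms.

```python
import math

def blinding_check(active_guessed_active, active_n, sham_guessed_active, sham_n):
    """Compare the share of patients in each arm who guessed 'active therapy'.

    Returns both proportions, the pooled two-proportion z statistic,
    and a two-sided p-value from the normal approximation.
    """
    p_active = active_guessed_active / active_n
    p_sham = sham_guessed_active / sham_n
    pooled = (active_guessed_active + sham_guessed_active) / (active_n + sham_n)
    se = math.sqrt(pooled * (1 - pooled) * (1 / active_n + 1 / sham_n))
    if se == 0:
        # Every patient gave the same guess, so the arms are indistinguishable.
        return p_active, p_sham, 0.0, 1.0
    z = (p_active - p_sham) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return p_active, p_sham, z, p_value

# Hypothetical counts: 31 of 50 active-arm patients and 29 of 50 sham-arm
# patients guessed that they had received the active therapy.
p_active, p_sham, z, p_value = blinding_check(31, 50, 29, 50)
print(f"active: {p_active:.0%}, sham: {p_sham:.0%}, z = {z:.2f}, p = {p_value:.2f}")
```

Similar proportions with a large p-value are compatible with a successful sham, although in a small survey a nonsignificant difference does not by itself demonstrate equivalence.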
Pragmatic trials, of course, do not compare CAM therapies with sham treatment. However, a patient's expectation of how likely it is that a therapy will be beneficial for their condition can substantially influence the results. In a study comparing massage, acupuncture, and usual care for patients with chronic low back pain, patients were asked their expectation before treatment. When patients were randomly assigned to treatments they thought more likely to be effective, they in fact reported greater efficacy (25). In a large pragmatic trial comparing chiropractic care with physical therapy, benefit of chiropractic care was much more pronounced in patients who originally presented to a chiropractor's office and were then randomly assigned to chiropractic care than it was in patients who initially presented to physical therapy and were then randomly assigned to chiropractic care (26). While blinding of patients or providers is clearly not possible in a pragmatic trial, it is possible to use outcome assessors who are blind to treatment. If feasible, such use should be considered a methodologic strength.
The variability in sham therapies across studies presents a challenge for a systematic review. If sham groups do vary in important ways, they cannot all be equated and treated as generic control groups in either narrative reviews or meta-analyses. The investigator must examine each sham treatment thoroughly, just as active-treatment groups are assessed. Categories of sham therapies may need to be constructed and each study classified. This categorization may result in a loss of sample size, since fewer studies will be comparable. Investigators should consider choosing the active treatment of interest, such as spinal manipulation, as the reference group, and then constructing effect sizes for alternative therapies versus this reference (27).
Challenge: Addressing Potentially Rare Serious Adverse Events
Although most CAM therapies are presumed to be safe, there is still the problem of how to assess and quantify the possibility of very rare, but potentially devastating, adverse events. Two examples illustrate this. Spinal manipulation given by U.S. chiropractors is performed tens of millions of times per year. A handful of reports describe devastating events, including death, stroke, and the cauda equina syndrome, after treatment with chiropractic spinal manipulation. Similarly, before it was removed from the U.S. market by the U.S. Food and Drug Administration (FDA), millions of portions of the herb ephedra were consumed. The FDA MedWatch program has received reports of several hundred very serious events (including death, heart attack, and stroke) that followed the consumption of ephedra (28). How are we to assess whether there are causal relationships between the use of spinal manipulation or the consumption of ephedra and these rare but serious events? If the benefit for any CAM therapy is modest or unproven, then the presence of even a very small increased risk for a serious event is enough to tip the scales against use of the CAM therapy.
Because these are rare events, RCT data will almost never be sufficient to prove or disprove a causal relationship between a CAM therapy and a rare adverse event. One of the best-studied CAM therapies is spinal manipulation. At this time, 38 RCTs have been published, with a total of 2302 patients who received spinal manipulation by itself or in addition to another treatment (27). No RCT reported a serious adverse event, and the upper bound of a one-sided 95% CI for the event rate with all patients combined is 0.13%. Similarly, at this time 52 RCTs of ephedra or purified ephedrine could be included in an adverse-event analysis (28). Again, none of these studies reported a serious adverse event, and the upper bound of the CI is 0.18%. Indeed, for a future randomized trial to have sufficient power to assess adverse events occurring at a rate of 1 per million, we calculate that such a study would need to enroll about 3 million treated patients. Clearly, this threshold will not be crossed for most CAM interventions, or any other interventions for that matter.
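The arithmetic behind these bounds can be reproduced with a short calculation. The sketch below assumes the exact binomial bound for zero observed events, and it assumes that "sufficient power" means a 95% chance of observing at least one event. Both are our reading of the text rather than a statement of the authors' method, and the function names are illustrative.

```python
import math

def zero_event_upper_bound(n, confidence=0.95):
    """Exact one-sided upper confidence bound on the event rate
    when 0 events are observed among n treated patients."""
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n)

def patients_needed(rate, confidence=0.95):
    """Smallest n for which at least one event is expected to be seen
    with the given probability if the true event rate is `rate`."""
    alpha = 1.0 - confidence
    return math.ceil(math.log(alpha) / math.log1p(-rate))

# 2302 patients received spinal manipulation in RCTs with no serious events.
bound = zero_event_upper_bound(2302)
print(f"upper bound: {bound:.2%}")

# Enrollment needed to study an event occurring once per million patients.
needed = patients_needed(1e-6)
print(f"patients needed: {needed:,}")
```

Under these assumptions the bound for 2302 patients rounds to 0.13%, and the required enrollment is just under 3 million, in line with the figures quoted above. The familiar "rule of three" (upper bound of roughly 3/n) gives nearly the same answers.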
With RCT data insufficient to draw conclusions, the next place to look for evidence is hypothesis-testing observational studies. The case–control study is the traditional epidemiologic tool for evaluating possible relationships between exposure and rare events, such as the relationship between smoking and lung cancer or between phenylpropanolamine and stroke (29). Indeed, 1 case–control study concluded that case-patients were 5 times more likely than controls to have visited a chiropractor in the week before a vertebrobasilar accident (30). Unfortunately, most CAM therapies have not been the subject of case–control studies.
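For readers less familiar with how a case–control comparison is quantified, the sketch below computes an odds ratio with a Woolf (log-scale) 95% CI from a 2×2 table. The counts are hypothetical and are not data from the study cited above; the function name is illustrative.

```python
import math

def odds_ratio_ci(case_exposed, case_unexposed, control_exposed, control_unexposed, z=1.96):
    """Odds ratio for exposure among cases versus controls,
    with a Woolf (log-scale) confidence interval.
    All four cells must be nonzero."""
    odds_ratio = (case_exposed * control_unexposed) / (case_unexposed * control_exposed)
    se_log = math.sqrt(
        1 / case_exposed + 1 / case_unexposed
        + 1 / control_exposed + 1 / control_unexposed
    )
    low = math.exp(math.log(odds_ratio) - z * se_log)
    high = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, low, high

# Hypothetical table: 10 of 100 case-patients and 10 of 500 controls
# had the exposure of interest in the week before the event.
or_estimate, ci_low, ci_high = odds_ratio_ci(10, 90, 10, 490)
print(f"OR = {or_estimate:.2f} (95% CI, {ci_low:.2f} to {ci_high:.2f})")
```

The wide interval that results from so few exposed participants illustrates why even a well-designed case–control study of a rare exposure can leave considerable uncertainty.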
A third source of potential evidence is large registries of people who have received the intervention. Such postmarketing surveillance has produced sufficient evidence to warrant the removal of numerous drugs (such as troglitazone [31]) because of rare but serious side effects, such as liver failure. Examination of large administrative databases is another way to assess a potential relationship between an intervention and an outcome. However, this type of study requires that the intervention be coded in administrative databases, and for many CAM therapies this is unlikely (or will underrepresent the actual delivery of the procedure) since only reimbursed interventions are coded for in administrative databases. This technique has been used for spinal manipulation but would not be useful for herbs or dietary supplements that are available over the counter.
The last form of evidence is the case report. In our ephedra analysis, we examined 66 case reports or series found in the published literature, 1474 reports submitted to MedWatch, and 18 502 reports submitted to Metabolife, a manufacturer of an ephedra dietary supplement (32). This analysis had several potential limitations. First, we did not have access to all reported adverse event files, and in any event, many authorities consider MedWatch case reports to underestimate the number of events. Many of the reports did not contain all the data that we needed to make assessments. How these cases might have influenced our findings had they contained appropriate documentation is unknown. Another important limitation is that we do not have an estimate of the number of people using ephedra. In addition, the use of ephedra and ephedrine is increasing over time, as is the probability that someone will report an adverse event because of publicity. Finally, the most important limitation is that an assessment of case reports is generally insufficient to reach conclusions regarding causality. Even with these limitations, the FDA regularly uses biological plausibility and the baseline rate of events to infer causality.
In sum, the possibility of rare but serious adverse events needs to be considered for CAM therapies, since benefits are usually small or unproven; thus, even small risks may be decisive. Controlled trials are the best way to assess causality, but will almost never have studied sufficient patients to assess risks that occur at a rate lower than 1 in 1000. Case–control or cohort studies can be searched for, but these are rare. The decision to include case reports must be made on a case-by-case basis. Additionally, reviewers should carefully note whether the inclusion and exclusion criteria for a trial included a medical evaluation. Such evaluations could identify, and exclude, persons at presumed increased risk for an adverse event. The generalizability of safety findings to a general population that may have access to the CAM therapy (such as an over-the-counter herbal product) without a medical evaluation needs to be considered.
Conclusion and Recommendations
The Table summarizes our recommendations. The EPC evidence reports of CAM topics incorporated many sources and methods for locating CAM literature. Investigators of these reviews commonly searched specialized databases, used an extensive list of keywords, and communicated with authors and experts; they rarely used hand searching. All of the approaches addressed one of the key biases inherent in the publication and indexing of CAM research, although in some cases, the approaches themselves may be prone to bias. Nonetheless, a review of CAM literature necessitates that all of these sources and methods be used to identify relevant literature, including the use of non–English-language reports for particular CAM topics. Research should be conducted to determine how the various sources and methods of locating CAM literature affect the estimates of treatment effectiveness in CAM systematic reviews, among various topics of the field. For example, how is the estimate of treatment effect of systematic reviews of various CAM topics affected, if at all, by the use of hand searching for identification of potentially relevant articles?
Investigators should also clearly report how they avoided bias in selecting primary studies for inclusion in evidence reports, since evidence suggests that CAM systematic reviews tend to be weak in this respect (33). More research is required to determine the biases that may affect the identification of CAM literature, such as citation bias and bias in provision of information by authors and experts. In addition, more research should be conducted to determine whether the prevalence of biases differs across CAM topics and, therefore, whether the relative importance of various sources and methods for locating CAM literature differs across these topics.
In assessing the internal validity of CAM therapies, a reviewer should first decide whether the therapy in question is amenable to a placebo-controlled, double-blind design. If so, internal validity criteria similar to those used for drug trials should be assessed. If the CAM therapy is not amenable to placebo control, then the reviewer should determine whether the study assessed a defined CAM therapy; if so, a sham procedure could be used. In that case, the investigators should assess the sham, usually by asking patients via a survey to guess which therapy they received. Using this technique and reporting similar proportions of patients guessing that they had received active therapy provides good evidence that the sham was successful. Reporting unequal proportions, or failing to perform the survey at all, leaves the success of the sham in question.
In pragmatic trials, the reviewer should evaluate whether the investigators provided some information about the training of the CAM practitioners, such that the results of the study could be applied in other situations. In addition, pragmatic trials should determine the patient's expectations for therapy and consider whether this information might have influenced the treatment. Studies that lack such an assessment must necessarily be considered potentially more prone to expectation bias.
Finally, it is a challenge to identify evidence sufficient to prove or refute the existence of serious adverse events associated with CAM. Randomized trials will almost never be sufficient, and other hypothesis-testing studies are likely to be unavailable. Frequently, the review must rely on case reports or on observational studies that lack a comparison group. Except in unusual circumstances, these types of evidence will not provide conclusive proof of the presence or absence of an elevated risk for an adverse event and at best can only provide a signal that a potential for increased risk exists.
Article and Author Information
- Disclaimer: The authors of this article are responsible for its contents. No statement in this article should be construed as an official position of the Agency for Healthcare Research and Quality, the National Center for Complementary and Alternative Medicine, the Office of Dietary Supplements, the National Institutes of Health, or the U.S. Department of Health and Human Services.
- Acknowledgments: The authors thank Dr. Terry Klassen for feedback on the manuscript, Di Valentine and Cony Rolon for their assistance, and Marilyn Josefsson for administrative support.
- Grant Support: This research was performed by the Southern California Evidence-based Practice Center based at RAND, Santa Monica, California, with assistance from the University of Alberta Evidence-based Practice Center at Edmonton, Alberta, Canada, under contract with the Agency for Healthcare Research and Quality (contract 290-02-0003) and is based on work originally supported by the National Center for Complementary and Alternative Medicine and the Office of Dietary Supplements, both at the National Institutes of Health. Dr. Shekelle was a Senior Research Associate of the Veterans Affairs Health Services Research and Development Service.
- Potential Financial Conflicts of Interest: Authors of this paper have received funding for Evidence-based Practice Center reports.
- Requests for Single Reprints: Paul G. Shekelle, MD, PhD, RAND Corporation, 1776 Main Street, Santa Monica, CA 90401; e-mail, Paul_Shekelle{at}rand.org.
- Current Author Addresses: Dr. Shekelle and Ms. Suttorp: RAND Corporation, 1776 Main Street, Santa Monica, CA 90401.
- Dr. Morton: RTI International, 3040 Cornwallis Road, PO Box 12194, Research Triangle Park, NC 27709-2194.
- Dr. Buscemi and Ms. Friesen: University of Alberta Evidence-based Practice Center, 11402 University Avenue, Room 9420, Edmonton, Alberta T6G 2J3, Canada.