Screening for Colorectal Cancer: A Targeted, Updated Systematic Review for the U.S. Preventive Services Task Force
- Evelyn P. Whitlock, MD, MPH;
- Jennifer S. Lin, MD, MCR;
- Elizabeth Liles, MD;
- Tracy L. Beil, MS; and
- Rongwei Fu, PhD
Abstract
Background: In 2002, the U.S. Preventive Services Task Force (USPSTF) recommended colorectal cancer screening for adults 50 years of age or older but concluded that evidence was insufficient to prioritize among screening tests or evaluate newer tests, such as computed tomographic (CT) colonography.
Purpose: To review evidence related to knowledge gaps identified by the 2002 recommendation and to consider community performance of screening endoscopy, including harms.
Data Sources: MEDLINE, Cochrane Library, expert suggestions, and bibliographic reviews.
Study Selection: Eligible studies reported performance of colorectal cancer screening tests or health outcomes in average-risk populations and were at least of fair quality according to design-specific USPSTF criteria, as determined by 2 reviewers.
Data Extraction: Two reviewers verified extracted data.
Data Synthesis: Four fecal immunochemical tests have superior sensitivity (range, 61% to 91%), and some have similar specificity (97% to 98%), to the Hemoccult II fecal occult blood test (Beckman Coulter, Fullerton, California). Tradeoffs between superior sensitivity and reduced specificity occur with high-sensitivity guaiac tests and fecal DNA, with other important uncertainties for fecal DNA. In settings with sufficient quality control, CT colonography is as sensitive as colonoscopy for large adenomas and colorectal cancer. Uncertainties remain for smaller polyps and frequency of colonoscopy referral. We did not find good estimates of community endoscopy accuracy; serious harms occur in 2.8 per 1000 screening colonoscopies and are 10-fold less common with flexible sigmoidoscopy.
Limitation: The accuracy and harms of screening tests were reviewed after only a single application.
Conclusion: Fecal tests with better sensitivity and similar specificity are reasonable substitutes for traditional fecal occult blood testing, although modeling may be needed to determine all tradeoffs. Computed tomographic colonography seems as likely as colonoscopy to detect lesions 10 mm or greater but may be less sensitive for smaller adenomas. Potential radiation-related harms, the effect of extracolonic findings, and the accuracy of test performance of CT colonography in community settings remain uncertain. Emphasis on quality standards is important for implementing any operator-dependent colorectal cancer screening test.
Colorectal cancer ranks third in incidence and second in cause of cancer death for both men and women (1). Most cases of colorectal cancer occur in average-risk individuals (those without a family or predisposing medical history), and increasing age, male sex, and black race are associated with increased incidence (2). Black persons have the highest incidence of and mortality rates from colorectal cancer among all racial and ethnic subgroups (3–7) and nearly double the colorectal cancer-related mortality rate compared with other ethnic minorities (8).
Colorectal cancer screening has been recommended by the U.S. Preventive Services Task Force (USPSTF) and many other organizations for more than 10 years (9). On the basis of evidence from multiple randomized, controlled trials (RCTs), a screening program with repeated annual or biennial guaiac fecal occult blood tests (FOBTs) and endoscopic follow-up of positive test results reduces colorectal cancer mortality; according to a recent update, colorectal cancer mortality was reduced 16% (CI, 10% to 22%) after 12 to 18 years (10). Extrapolating from trial evidence, clinical studies of test accuracy, and other supporting evidence, the USPSTF recognized flexible sigmoidoscopy (with or without FOBTs), colonoscopy, and double-contrast barium enema as other colorectal cancer screening options in 2002 (11, 12). However, because colorectal cancer screening tests have potential harms, limited accessibility, or imperfect acceptability to patients, and no tests could be identified as superior in cost-effectiveness analysis (13), the USPSTF also recommended that the choice among recommended methods for colorectal cancer screening be individualized to patients or practice settings (14).
Despite strong recommendations from the USPSTF and many others, serial national surveys document inadequate, slowly improving rates of colorectal cancer screening in the United States (15–20). In 2006, 60.8% of adults 50 years of age or older reported recent colorectal screening (20). Disparities in colorectal cancer screening exist, with lower rates of colorectal cancer screening in nonwhite and Hispanic populations (16, 21, 22) and in areas with higher poverty rates (23).
To increase the uptake of and benefits from recommended colorectal cancer screening, researchers have sought to improve the accuracy, acceptability, or accessibility of screening by introducing new tests or enhancing existing tests. However, the availability of additional options for colorectal cancer screening, including highly sensitive guaiac FOBT, fecal immunochemical testing, fecal DNA testing, and virtual colonoscopy approaches such as computed tomographic (CT) colonography, has created uncertainty about what methods should be used for colorectal cancer screening in the general population.
To assist the USPSTF in updating its 2002 recommendation for colorectal cancer screening in average-risk adults age 50 years or older, we conducted a targeted systematic review primarily focused on evidence gaps or new evidence since the previous review. This approach updated what the USPSTF judged was the most important evidence for newer colorectal cancer screening tests and community-performed endoscopies, and it was supplemented by a companion decision analysis examining screening program performance and life-years gained by using different colorectal cancer screening tests, test intervals, and starting and stopping ages (24).
Methods
Under guidance from the USPSTF, this targeted review addressed only the first 3 questions of the full evidence chain in the analytic framework (Figure 1). From our larger report (25), we report here the accuracy (one-time test performance characteristics) and potential harms of newer colorectal cancer screening tests (high-sensitivity FOBTs, fecal immunochemical tests, fecal DNA testing, and CT colonography) in screening populations (key questions 2b and 3b) and the accuracy and harms of screening colonoscopy and flexible sigmoidoscopy in the community setting (key questions 2a and 3a). In the full report, we discuss lack of new data on the mortality benefits of colorectal cancer screening beyond FOBT programs (key question 1); race-, sex-, and age-related issues in colorectal cancer screening; considerations of targeted screening recommendations; and suggested future research. Detailed methods are provided in the Appendix and Appendix Tables 1, 2, and 3 and in the full report (25).
KQ1: What is the effectiveness of the following screening methods (alone or in combination) in reducing mortality from colorectal cancer? a. Flexible sigmoidoscopy; b. Colonoscopy; c. Computed tomographic (CT) colonography; d. Fecal screening tests: i. High-sensitivity guaiac fecal occult blood tests (FOBTs); ii. Fecal immunochemical test; iii. Fecal DNA test.
KQ2a: What are the sensitivity and specificity of 1) colonoscopy and 2) flexible sigmoidoscopy when used to screen for colorectal cancer in the community practice setting?
KQ2b: What are the test performance characteristics of 1) CT colonography and 2) fecal screening tests (as listed in KQ1d) for colorectal cancer screening, as compared to an acceptable reference standard?
KQ3a: What are age-specific rates of harm from colonoscopy and flexible sigmoidoscopy in the community practice setting?
KQ3b: What are the adverse effects of newer tests, including 1) CT colonography and 2) fecal screening tests (as listed in KQ1d)?
Searches and Selection Process
In brief, we searched PubMed; Database of Abstracts of Reviews of Effects; Cochrane Database of Systematic Reviews; and the Institute of Medicine, National Institute for Health and Clinical Effectiveness, and Health Technology Assessment databases for recent systematic reviews (1999–2006) to support our review of all key questions (26). We found 11 existing systematic reviews for newer colorectal cancer screening tests (key question 2b). Using methods detailed in the Appendix, we selected 3 good-quality reviews of CT colonography (27, 28) or fecal DNA testing (29) to locate relevant primary studies; we supplemented these with additional MEDLINE and Cochrane Library searches from January 2006 through January 2008 to locate studies published after the end date of the reviews' searches. Because there were no good-quality relevant systematic reviews for reports on fecal immunochemical tests (key questions 2b and 3b), we searched MEDLINE and the Cochrane Library from 1990 to 2008 for these tests. We also searched both databases from 2000 to 2008 to locate studies of the harms of screening tests (key questions 3a and 3b) published since the 2002 report.
Abstracts and articles were dual-reviewed against inclusion criteria (Appendix) and required agreement of 2 reviewers. Eligible studies reported on the sensitivity and specificity of colorectal cancer screening tests or on health outcomes. We excluded studies that did not address average-risk populations for colorectal cancer screening, unless an average-risk subgroup was reported. We excluded case-control studies of screening accuracy because these may overestimate sensitivity as a design-related source of bias (30), as recently demonstrated for FOBTs (31). To avoid biases related to reference standards, we excluded studies of test accuracy that incompletely applied a valid reference standard or used an inadequate reference standard (32). For CT colonography, we considered only technologies that were compared with colonoscopy in average-risk populations, used a multidetector scanner (27), and reported per-patient sensitivity and specificity. In all, we evaluated 3948 abstracts and 490 full-text articles (Figure 2).
KQ = key question; SER = standardized evidence review. For list of key questions, see legend for Figure 1.
Quality Assessment and Data Abstraction
Two investigators critically appraised and quality-rated all eligible studies by using design-specific USPSTF criteria (33) supplemented by other criteria (Appendix). Poor-quality studies were excluded. One investigator abstracted key elements of included studies into standardized evidence tables. A second reviewer verified these data. We resolved disagreements about data abstraction or quality appraisal by consensus. Evidence tables and tables of excluded studies for each key question are available in the full report (25).
Data Synthesis and Analysis
We report qualitative synthesis of the results for most key questions because of study heterogeneity. The performance of screening tests is preferentially described per person (sensitivity and specificity), supplemented by per-polyp analyses (miss rates). Sensitivity for large adenomas from 2 similar studies of CT colonography screening was combined by using the inverse variance fixed-effects model because no heterogeneity was detected on the basis of the Cochran Q test and the I² statistic (34). Because of the stringency of our inclusion criteria for studies to estimate rates of endoscopy harms in the community practice setting (key question 3a), included studies were clinically homogeneous enough to pool. A random-effects logistic model was used to evaluate statistical heterogeneity, estimate pooled rates, and explore potential sources of variation for complications from study-level characteristics (35, 36). Model details and SAS PROC NLMIXED code are provided in the Appendix. Total serious adverse events required hospital admission (for example, perforation, major bleeding, severe abdominal symptoms, and cardiovascular events) or resulted in death. Results of exploratory analyses for potential sources of variation for pooled estimates are discussed in the full report, along with pooled estimates for individual complications, such as perforations (25).
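As an illustration of the inverse variance fixed-effects approach described above, the following is a minimal sketch in Python. It assumes pooling of proportions on the untransformed scale with a normal-approximation variance; the review does not state which scale was used, so that choice is an assumption. The function name and the study counts are hypothetical and are not data from the included studies.

```python
import math


def pool_fixed_effect(events, totals):
    """Inverse-variance fixed-effect pooling of proportions (raw scale).

    Assumes 0 < events < total in every study, so each variance is nonzero.
    Returns the pooled proportion, a 95% CI, Cochran Q, and I-squared (%).
    """
    estimates, weights = [], []
    for x, n in zip(events, totals):
        p = x / n
        variance = p * (1 - p) / n      # normal-approximation variance
        estimates.append(p)
        weights.append(1 / variance)    # inverse-variance weight

    weight_sum = sum(weights)
    pooled = sum(w * p for w, p in zip(weights, estimates)) / weight_sum
    se = math.sqrt(1 / weight_sum)

    # Cochran Q: weighted squared deviations from the pooled estimate
    q = sum(w * (p - pooled) ** 2 for w, p in zip(weights, estimates))
    df = len(estimates) - 1
    # I-squared: share of variation beyond chance, truncated at 0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    return {
        "pooled": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "q": q,
        "i2": i2,
    }


# Hypothetical counts: persons with a large adenoma detected / persons with one
result = pool_fixed_effect(events=[40, 90], totals=[50, 100])
print(result)
```

When Q is no larger than its degrees of freedom, I² is truncated to 0%, which is the situation in which a fixed-effects model is usually considered reasonable.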
Role of the Funding Source
The Agency for Healthcare Research and Quality funded this work, provided project oversight, and assisted with internal and external review of the draft evidence synthesis but had no role in the design, conduct, or reporting of the review. The authors worked with 4 USPSTF members to develop the analytic framework, set the review scope, and resolve methodologic issues during the conduct of the review. The draft systematic review was reviewed by 8 external peer reviewers and was revised for the final version.
Results
Our results are organized by screening method rather than key question, with newer tests discussed first. More detailed results, including evidence tables for each key question, are available in the full report (25).
Fecal Immunochemical Tests, Hemoccult SENSA, Fecal DNA, and CT Colonography (Key Questions 2b and 3b)
We evaluated 3 categories of newer fecal colorectal cancer screening tests (fecal immunochemical testing, high-sensitivity guaiac FOBT, and fecal DNA testing) and CT colonography. Among these, the largest body of fair- or good-quality evidence with which to evaluate performance of colorectal cancer screening tests in average-risk screening populations was for several different fecal immunochemical tests, followed by Hemoccult SENSA (Beckman Coulter, Fullerton, California), CT colonography, and fecal DNA testing.
Accuracy of Newer FOBTs
Although we found 9 fair- or good-quality cohort studies evaluating fecal immunochemical tests in 86498 average-risk persons, these tests cannot be clearly analyzed as a class (37). Therefore, we grouped results by test type for 4 different tests (Table 1). Limited data suggest better detection of colorectal cancer and large adenomas with 2 to 3 days of sample collection for FOBTs than with 1 day of sample collection. With few exceptions, studies did not directly compare fecal immunochemical tests with each other or with regular or high-sensitivity Hemoccult testing.
Overall, fecal immunochemical tests had higher sensitivity for colorectal cancer (61% to 91%) (38–46) than was reported for nonrehydrated Hemoccult II (25% to 38%) in another recent systematic review (31) and in the only study of fecal immunochemical testing that also evaluated Hemoccult II (39). Estimated specificity varied across fecal immunochemical tests (91% to 98%), and, in most studies, specificity appears lower than the reported specificity of nonrehydrated Hemoccult II (98% to 99%) (39). Sensitivity for advanced neoplasia or large adenomas was less commonly reported but ranged from 27% to 67% for fecal immunochemical tests (39, 40, 43–45). The sensitivity of nonrehydrated Hemoccult II for large adenomas has been estimated at 16% to 31% (31). The single study directly comparing HemeSelect and nonrehydrated Hemoccult II reported twice the sensitivity for polyps 10 mm or greater for HemeSelect (SmithKline Diagnostics, San Jose, California) (67% vs. 31%) (39). Currently, U.S. Food and Drug Administration (FDA)-approved fecal immunochemical tests with fair- or good-quality studies of screening test performance are largely not available on the U.S. market. Of the 4 fecal immunochemical tests discussed here, few were both FDA approved and on the U.S. market at the time this article was written.
Hemoccult SENSA had higher sensitivity for colorectal cancer (64% to 80%) than would be expected for Hemoccult II but lower specificity (87% to 90%) (38, 39) (Table 1). In direct comparisons, Hemoccult SENSA was less sensitive for colorectal cancer (64%) than was FlexSure OBT/Hemoccult ICT (82%) but more sensitive for large adenomas (41% vs. 30%). Hemoccult SENSA was more sensitive for colorectal cancer (79%) than HemeSelect (69%) but had similar sensitivity for large adenomas (69% vs. 67%, respectively). Hemoccult SENSA was less specific for colorectal cancer and for adenomas compared with both fecal immunochemical tests (38). More people would be referred for colonoscopy with Hemoccult SENSA than with fecal immunochemical tests because of 2- to 3-fold higher rates of positive test results with the former. A combination Hemoccult SENSA/FlexSure screening approach, in which the fecal immunochemical test was developed only if the guaiac-based test result was positive, had identical sensitivity and better specificity compared with Hemoccult SENSA alone (98.1% vs. 90.1%). These estimates provide relative rather than absolute sensitivity or specificity because patients with negative results underwent flexible sigmoidoscopy (or registry follow-up) only.
Accuracy of Fecal DNA Testing
Eligible fecal DNA screening studies were limited to a fair-quality large cohort study that used a multitarget fecal DNA panel test (the precommercial version of PreGen Plus, version 1 [Exact Sciences, Marlborough, Massachusetts], which tests for 21 DNA mutations in the K-ras, APC, and p53 genes, along with markers for microsatellite instability and long DNA) in average-risk patients undergoing colonoscopy (47) and a smaller cohort study that tested a single mutation of the K-ras gene (48). We will not further discuss the test for the single K-ras gene mutation because it showed zero sensitivity: It was positive in none of the 31 participants with advanced colorectal neoplasia, including 7 patients with invasive colorectal cancer.
Researchers compared a one-time application of PreGen Plus (version 1.0) with 3-card nonrehydrated Hemoccult II in a study that enrolled 5486 average-risk asymptomatic patients who were all to undergo colonoscopy (47) (Table 1). Among the 4404 who adhered to all 3 tests, a subset (n = 2507; mean age, 69.5 years; 45% male; 87% white; 14% with a positive family history) was selected for fecal DNA testing on the basis of colonoscopic and histopathologic results.
Test performance for fecal DNA was compared with that for Hemoccult II in the selected subgroup; among these patients, 8.2% had positive results on the fecal DNA panel and 5.8% had positive Hemoccult II results. One-time fecal DNA testing was more sensitive for adenocarcinoma than was Hemoccult II (sensitivities of 51% [CI, 34.8% to 68.0%] and 12.9% [CI, 5.1% to 28.9%], respectively). Both fecal DNA testing and Hemoccult II had poor sensitivity for advanced carcinoma. Although specificity for minor polyps or no polyps did not differ between fecal DNA and Hemoccult II, power to detect a difference may have been limited because the full sample was not tested.
Serious Harms of Fecal Colorectal Cancer Screening
We found no studies addressing serious adverse effects from any type of fecal colorectal cancer screening tests. Risks are most likely related to false-positive test results and the associated risks from unnecessary colonoscopy screening.
Accuracy of CT Colonography
Although we located 7 fair- or good-quality cross-sectional studies (49–55) examining a total of 4468 average-risk patients screened for colorectal cancer with both CT colonography and same-day colonoscopy, 3 of these (50–52) did not contribute to our estimates of CT colonography test performance because of study limitations described in our larger report (25). The 4 remaining studies discussed here examined CT colonography screening in 4312 average-risk patients (Table 2); 3 of these studies also estimated colonoscopy sensitivity (49, 53, 54).
The 2 largest and most comparable and applicable studies were conducted by Pickhardt and colleagues (49) and the American College of Radiology Imaging Network (ACRIN) (55) and together represent 87% of patients. These 2 studies found that CT colonography was comparable to colonoscopy for detecting large adenomas (≥10 mm), but not necessarily for smaller adenomas (≥6 mm). Pooled sensitivity for large adenomas in these 2 studies was 92% (CI, 87% to 96%), with no statistical heterogeneity detected between the studies (I² = 0%; P = 0.42). Point estimates for the sensitivity of CT colonography for smaller adenomas in ACRIN (78% [CI, 71% to 85%]) were about 11 percentage points lower than for Pickhardt and colleagues' study (88.7% [CI, 82.9% to 93.1%]) and significantly lower than estimates for optical colonoscopy obtained by using an enhanced reference standard of segmental unblinding (49). In addition, although CIs for sensitivity for detecting smaller adenomas overlap with those for the sensitivity for larger adenomas within both studies, intervals are wide. We did not pool sensitivity estimates for smaller adenomas because the 2 studies had quite different results, which were also statistically heterogeneous. This finding suggests uncertainty about the true sensitivity of CT colonography for smaller adenomas. Of note, the sensitivity of CT colonography for at least 1 of the studies (55) is predicated on CT colonography-detected lesions that were 5 mm or greater, although these would not be the basis for referral for colonoscopy. The authors report that using a radiologic threshold of 6 mm for CT colonography-detected lesions reduced the sensitivity for large adenomas to 88%; similar data to estimate the change in sensitivity for smaller lesions are not provided. Sensitivity estimates for large adenomas or tumors for the 5-mm threshold varied among radiologists (from 67% to 100%), with fewer than half of radiologists detecting 100% of the 1 to 13 large adenomas in the cases they read.
One of 7 colorectal tumors was missed on CT colonography in 1 study (55), whereas both colorectal tumors were detected by CT colonography in the other (49).
Per-patient specificity of CT colonography for small or large adenomas varied between the 2 largest studies. One study that used segmental unblinding to clearly distinguish false-positive CT colonography findings from false-negative colonoscopy findings had statistically significantly worse specificity (79.6% [CI, 77.0% to 82.0%]) for lesions 6 mm or greater, compared with 96% specificity for lesions 10 mm or greater (49). In contrast, ACRIN reported similar specificity for lesions regardless of size, with better specificity (88% [CI, 84% to 92%]) for lesions 6 mm or greater than reported by Pickhardt and colleagues (55). We did not pool specificity estimates because between-study results were too different and were statistically heterogeneous. In the ACRIN study, 40% (CI, 33.5% to 46.3%) of patients with lesions 6 mm or greater detected on CT colonography had lesions 6 mm or greater detected on colonoscopy.
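The link between specificity and the share of positive CT colonography results that are confirmed on colonoscopy can be made concrete with Bayes' rule. The sketch below is illustrative only: the function name is hypothetical, the sensitivity (78%) and specificity (88%) are borrowed from the ACRIN estimates quoted in the text purely as example inputs, and the prevalence values are assumptions, not figures from any included study. It does not attempt to reproduce the 40% reported by ACRIN.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a positive test result is a true positive (Bayes' rule)."""
    true_positive = sensitivity * prevalence
    false_positive = (1 - specificity) * (1 - prevalence)
    total_positive = true_positive + false_positive
    if total_positive == 0:
        raise ValueError("no positive results are possible with these inputs")
    return true_positive / total_positive


# Hypothetical prevalences of lesions 6 mm or greater (assumed, not from the review)
for prevalence in (0.05, 0.10, 0.15):
    ppv = positive_predictive_value(0.78, 0.88, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}")
```

Because false-positive results come from the much larger group without lesions, modest losses in specificity translate into many additional colonoscopy referrals when prevalence is low.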
Sensitivity and specificity estimates from 2 smaller fair-quality studies comparing CT colonography with colonoscopy are less informative because these studies detected relatively few lesions and their primary purposes were 1) to examine the relative accuracy of 2-dimensional vs. 3-dimensional methods for displaying and reviewing CT colonography images and 2) to compare radiologist performance (53, 54). Thus, these studies do not provide overall results for the population but rather report subsets of data to compare readers or technologies. Results are generally consistent, with better sensitivity for larger (compared with smaller) lesions, no clear differences between 2- and 3-dimensional approaches (which was confirmed by ACRIN), and some degree of interreader variability (which seems exaggerated in these studies because of small numbers of lesions).
The pooled sensitivity estimates for large adenomas provided here might be considered best-case estimates because the studies had very low (<1%) rates of inadequate examinations, used standardized CT technologies, used fecal tagging and contrast-based luminal fluid opacification, and used a limited number of very experienced radiologists for all readings. In addition, we know little about the sensitivity of CT colonography for flat adenomas from these studies. In a related report from the study by Pickhardt and colleagues (56), the per-lesion sensitivity for flat adenomas 6 mm or greater (82.8%) was reported to be similar to the sensitivity for polypoid adenomas 6 mm or greater (86.2%). This determination, however, was based on a total of 29 flat adenomas 6 mm or greater, with flat polyps found in 52 of 1233 persons (4.2%) (56).
On the basis of a referral threshold of any polyp 6 mm or greater, these studies suggest that 1 in 3 to 1 in 8 persons screened with CT colonography would be referred for colonoscopy.
Serious Harms of CT Colonography
Few serious, procedure-related harms (for example, perforation, major events requiring medical attention) have been reported in 6 fair-quality cohort studies that addressed potential adverse effects with CT colonography screening (49, 54, 55, 57–59). Overall, the risk for perforation with screening CT colonography in asymptomatic persons seems very low, with no perforations reported in 2 studies of 14238 screening CT colonographies (55, 57) or in a study of 3120 CT colonographies (54). In 1 study, however, 1 person among 2531 persons undergoing both CT colonography and colonoscopy was hospitalized for bacteremia (55). Among 11870 screening and diagnostic CT colonography examinations, researchers reported just 1 perforation in the subgroup of persons undergoing screening CT colonography, compared with 6 in the subgroup undergoing diagnostic CT colonography (59). Two small studies (n = 1587) did not report on perforation rates but did report that no major adverse events occurred (49, 58).
Harms related to bowel preparations required for CT colonography, colonoscopy, or flexible sigmoidoscopy are considered in the larger report (25).
Uncertain Effects of CT Colonography Screening
Uncertainties associated with CT colonography screening include potential long-term harms from CT colonography-related radiation exposure. In addition, because CT colonography produces images of structures outside the colon, the implications of extracolonic findings that occur with CT colonography screening (including potential benefits from early disease detection as well as harms from unnecessary medical testing and anxiety) are unclear.
We identified no studies that directly measured harms caused by low-dose radiation exposure from CT. However, existing models can indirectly estimate potential adverse effects for lifetime attributable risk for cancer by extrapolating the cancer-related risks at the range of effective radiation doses reported for CT colonography from existing risk models based on much higher radiation exposure. On the basis of 2 reviews, total radiation exposure with CT colonography ranges from 1.6 to 24.4 mSv for dual positioning (both supine and prone), with a median dose estimate of 8.8 mSv or 10.2 mSv per examination (60, 61). On the basis of the National Research Council's Biological Effects of Ionizing Radiation (BEIR) VII phase 2 report findings (62), the National Research Council predicts that approximately 1 additional individual per 1000 would develop cancer (solid cancer or leukemia) from exposure to 10 mSv above background (according to the linear no-threshold model). Because of limitations in the data used to develop this model, these risk estimates are uncertain and could vary by a factor of 2 or 3 (62). In addition, some organizations believe that the linear no-threshold model is an oversimplification that may overestimate the risk for malignancy (63).
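To show what the linear no-threshold assumption implies across the dose range quoted above, the following sketch scales the cited figure (about 1 additional cancer per 1000 persons at 10 mSv) in direct proportion to dose. This is a simplification for illustration: it ignores age and sex, the function name is hypothetical, and using a factor of 3 in both directions as the uncertainty bound is an assumption based on the "factor of 2 or 3" statement, not a published interval.

```python
# Figures quoted in the text (BEIR VII phase 2, linear no-threshold model)
REFERENCE_EXCESS_PER_1000 = 1.0   # about 1 additional cancer per 1000 persons
REFERENCE_DOSE_MSV = 10.0         # at 10 mSv above background


def lnt_excess_per_1000(dose_msv, uncertainty_factor=1.0):
    """Excess cancers per 1000 persons, scaled linearly with dose.

    uncertainty_factor multiplies the central estimate; it is an
    illustrative way to express the stated factor-of-2-or-3 uncertainty.
    """
    return REFERENCE_EXCESS_PER_1000 * (dose_msv / REFERENCE_DOSE_MSV) * uncertainty_factor


# Doses reported for CT colonography: range limits and the two median estimates
for dose in (1.6, 8.8, 10.2, 24.4):
    central = lnt_excess_per_1000(dose)
    low = lnt_excess_per_1000(dose, 1 / 3)    # assumed lower bound
    high = lnt_excess_per_1000(dose, 3)       # assumed upper bound
    print(f"{dose:5.1f} mSv: {central:.2f} per 1000 (assumed range {low:.2f} to {high:.2f})")
```

If the linear no-threshold model overestimates risk at low doses, as some organizations argue, these values would be upper-leaning estimates.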
Extracolonic findings detected by CT colonography are common, occurring in 27% to 69% of persons screened with CT colonography (Appendix Table 4). We identified 9 studies (n = 12557) that reported estimates of extracolonic findings in asymptomatic persons (49, 55, 64–70). In these studies, classification of extracolonic findings varied but generally considered 3 types of clinical significance: high (findings that require surgical treatment, medical intervention, or further investigation), moderate (findings that would not require immediate medical attention but would probably require recognition, investigation, or future treatment), and low (findings that would not require further investigation or treatment). These 3 categories generally map to the CT Colonography Report and Data System (C-RADS) (71), as described elsewhere (25). Extracolonic findings of high clinical significance (for example, indeterminate solid organ masses or chest nodules, abdominal aortic aneurysms ≥3 cm, aneurysms of the splenic or renal arteries, or adenopathy >1 cm) occurred in 4.5% to 11% of asymptomatic populations (49, 65–67, 69, 70). Extracolonic findings of moderate clinical significance (such as renal calculi and small adrenal masses) were equally or more common and occurred in up to 27% (49, 64, 65, 67–70). Because all extracolonic findings of high significance, along with some moderate findings, would require medical follow-up, these have the potential for additional morbidity and cost, as well as potential benefit. Across studies, approximately 7% to 16% of persons undergoing CT colonography were recommended to have additional diagnostic evaluation for extracolonic findings (55, 64, 65, 67, 68, 70). Only a minority of these findings ultimately warranted definitive treatment (for example, repair of abdominal aortic aneurysm, resection of malignant lesions, or chemotherapy for metastatic lesions) (64, 65, 68–70).
Although these estimates provide important contextual information, they are limited by the available studies, which varied greatly in their ability to accurately assess follow-up and in the duration of follow-up, the longest of which was 2 years.
Colonoscopy and Flexible Sigmoidoscopy in Community Settings (Key Questions 2a and 3a)
Accuracy of Colonoscopy
Evaluating the accuracy of screening colonoscopy in average-risk participants, particularly in community settings, is challenging because of the lack of an independent gold standard and very few applicable studies. As detailed in the full report (25), we found no studies of miss rates after tandem screening colonoscopy in average-risk patients to fairly represent performance of community endoscopists, and no studies of repeated colonoscopy within 3 years after screening colonoscopy in a representative sample of average-risk community-based patients.
Researchers have used CT colonography screening studies already discussed (49, 53, 54) to estimate the sensitivity of colonoscopy for colorectal cancer and for adenomas of various sizes detected using either CT colonography or colonoscopy. Two of these studies conducted CT colonography followed by colonoscopy with segmental unblinding to recheck CT colonography-located lesions not seen on first-pass colonoscopy (49, 53); 1 of these provides the single best estimate for community performance of colonoscopy (49) (Table 2). In this good-quality study of 1233 average-risk persons, colonoscopy by 1 of 17 experienced colonoscopists missed 10% of adenomas 6 mm or greater and 12% of adenomas 10 mm or greater. Sensitivity (per-person detection rate) of colonoscopy for adenomas 6, 8, or 10 mm or greater did not statistically significantly differ from sensitivity of CT colonography. Colonoscopy missed 1 of 2 colorectal lesions detected, whereas CT colonography detected both. In the second study using segmental unblinding, no colorectal cancer was detected in 96 average-risk patients using either test, and colonoscopy by 1 of 5 gastroenterologists missed 10% of polyps 6 mm or greater but no polyps 10 mm or greater. Colonoscopy was much less accurate in the third study of 452 asymptomatic, average-risk patients, detecting only 77% (20 of 26) of neoplasms 10 mm or greater and just 1 of 5 colorectal lesions detected by CT colonography (53). This study, however, evaluated the performance of more than 50 experienced endoscopists, whereas CT colonography was conducted by 3 very experienced radiologists.
Taken together, these data are insufficient to provide precise estimates of the sensitivity of colonoscopy in community settings, particularly for colorectal cancer detection, because of the small number of patients studied (n = 1781) and the relatively few lesions (7 total colorectal lesions). They do, however, confirm that colonoscopy misses some polyps and may also miss colorectal cancer.
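To illustrate why so few lesions cannot support precise sensitivity estimates, the following minimal sketch computes approximate 95% confidence intervals for the two detection proportions reported for the third study (20 of 26 and 1 of 5). The Wilson score interval is used here only as a convenient illustration; it is an assumption of this sketch, not the method used in the cited studies, and the function name is hypothetical.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (approximately 95% when z=1.96)."""
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half

# Counts quoted in the text for the third CT colonography study.
for label, hits, n in [("neoplasms 10 mm or greater", 20, 26),
                       ("colorectal lesions", 1, 5)]:
    lower, upper = wilson_interval(hits, n)
    print(f"{label}: {hits}/{n} = {hits / n:.0%} "
          f"(Wilson 95% CI, {lower:.0%} to {upper:.0%})")
```

With only 5 lesions, the interval spans most of the plausible range, which is consistent with the caution expressed above about colorectal cancer detection.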
Serious Harms from Colonoscopy
We found 17 fair- or good-quality, primarily prospective, studies evaluating clinically significant adverse events from screening colonoscopy conducted in predominantly asymptomatic persons (49, 55, 67, 72–85). Only 1 of these studies (81) was included in the 2002 systematic review for the USPSTF. Seven of these 17 studies were conducted in community settings (55, 73, 75, 77, 79, 81–83). Using a random-effects logistic model to pool data from the 12 studies (n = 57742) (49, 55, 73–76, 79, 80, 82–85) reporting this outcome, we found 2.8 total serious complications (including perforations, hemorrhage, diverticulitis, cardiovascular events, severe abdominal pain, and death) per 1000 procedures (CI, 1.5 to 5.2 per 1000 procedures; test for heterogeneity, P = 0.13) (Appendix Figure 1). When we limited the model to the 7 studies conducted in the United States, serious complications were nonsignificantly reduced (2.5 per 1000 procedures [CI, 1.0 to 6.1 per 1000 procedures]). Because of reporting limitations, complication rates could not be calculated for colonoscopies with and without polypectomy. Only 3 of these 12 studies reported the proportion of colonoscopies in which polypectomies were performed; the proportions ranged from 41% to 68% (79, 80, 82). In these 3 studies, more than 85% of serious complications, perforations, and major bleeding incidents occurred during colonoscopies that required polypectomies. We could not estimate complications by age because of limitations in study reporting.
Appendix Figure 1 (notes). Test for heterogeneity for all studies based on logit of proportions using a random-effects model (P = 0.13).
* 95% CIs are exact confidence intervals.
Accuracy of Flexible Sigmoidoscopy
We found no studies that estimated accuracy of flexible sigmoidoscopy in average-risk patients undergoing screening with both flexible sigmoidoscopy and colonoscopy. We report here the accuracy of screening with simulated flexible sigmoidoscopy reported in 6 large cohort studies of screening colonoscopy in a total of 14938 average-risk patients (86–91). Elsewhere (25), we describe 3 studies (1 tandem flexible sigmoidoscopy study that reported adenoma miss rates [92] and 2 prospective studies that reported distal advanced neoplasia or colorectal cancer on flexible sigmoidoscopy repeated 3 years after negative results on screening flexible sigmoidoscopy [93, 94]) that do not provide any greater precision than these estimates.
The estimated sensitivity of flexible sigmoidoscopy (using either biopsy or visual inspection to determine colonoscopy referral) for colorectal cancer throughout the entire colon was 58% to 75%, based on small numbers of colorectal lesions, with an estimated sensitivity of 72% to 86% for advanced neoplasia. Variations in these estimates are probably due to differences in examiner skill and the patient's risks for proximal lesions in the unexamined colon. These estimates are further limited because they simulate flexible sigmoidoscopy results by using colonoscopy examinations. This approach presumes that all lesions are detected if they are within the insertion depth for flexible sigmoidoscopy and ignores differences introduced through the more thorough bowel preparation used for colonoscopy or through colonoscopists' skill. The community performance of flexible sigmoidoscopy screening and its effect on health outcomes, including mortality from colorectal cancer, will become clearer after current RCTs are reported.
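The simulation logic described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any cited cohort study: the record type, field names, and the five patient records are all hypothetical, and the sketch assumes that any finding within reach of the sigmoidoscope is seen and triggers colonoscopy referral.

```python
from dataclasses import dataclass

@dataclass
class ColonoscopyResult:
    distal_finding: bool   # any lesion within the assumed sigmoidoscope reach
    cancer_anywhere: bool  # colorectal cancer found anywhere in the colon

def simulated_fs_sensitivity(results):
    """Share of persons with cancer who have a distal finding that would
    have prompted colonoscopy referral after flexible sigmoidoscopy."""
    cancers = [r for r in results if r.cancer_anywhere]
    if not cancers:
        raise ValueError("no cancers in sample; sensitivity is undefined")
    detected = sum(1 for r in cancers if r.distal_finding)
    return detected / len(cancers)

# Hypothetical records, for illustration only.
HYPOTHETICAL = [
    ColonoscopyResult(distal_finding=True, cancer_anywhere=True),    # distal cancer
    ColonoscopyResult(distal_finding=True, cancer_anywhere=True),    # proximal cancer plus distal adenoma
    ColonoscopyResult(distal_finding=False, cancer_anywhere=True),   # isolated proximal cancer (missed)
    ColonoscopyResult(distal_finding=True, cancer_anywhere=False),
    ColonoscopyResult(distal_finding=False, cancer_anywhere=False),
]

print(f"simulated sensitivity: {simulated_fs_sensitivity(HYPOTHETICAL):.0%}")
```

Because every distal lesion is counted as detected, a calculation of this kind gives an optimistic bound on what flexible sigmoidoscopy would achieve in practice.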
Serious Harms from Flexible Sigmoidoscopy
We found 8 fair- or good-quality studies that evaluated clinically significant adverse events from flexible sigmoidoscopy for colorectal cancer screening in an average-risk population (72, 74, 84, 85, 95–98). Only 1 of these studies was included in the 2002 review (72).
Using a random-effects logistic model to pool data from the 6 studies (72, 74, 84, 85, 95, 96) reporting this outcome (n = 126985), we found 0.34 serious complications per 1000 procedures (CI, 0.06 to 1.9 per 1000 procedures; test for heterogeneity, P = 0.26) (Appendix Figure 2). Serious complications were defined the same as for screening colonoscopy but excluded complications from follow-up colonoscopy. Per protocol, all of these studies performed polypectomy during flexible sigmoidoscopy; based on 2 studies, polypectomies were conducted in 20% to 22% of flexible sigmoidoscopy examinations (72, 74). We could not estimate complications by age because of limitations in study reporting.
Discussion
Since 2002, research on colorectal cancer screening has grown substantially as researchers have investigated the accuracy of novel screening approaches and have continued examining already recommended approaches. As discussed in our full report (25), we found no new reports of the mortality impact of colorectal cancer screening (besides FOBT programs); however, results from several trials of flexible sigmoidoscopy that will report mortality effects are pending (84, 99–101). In addition, although we found many studies addressing test performance of newer FOBTs, fecal DNA screening tests, or CT colonography (25), relatively few addressed average-risk screening populations and used minimally acceptable study designs and methods. Table 3 summarizes review findings about the performance and harms of new fecal screening tests, CT colonography, colonoscopy, and flexible sigmoidoscopy by key question, with newer tests reported first.
Recent guidance articulates evidence requirements to justify replacing a currently recommended diagnostic (or screening) test with a newer test in the absence of RCTs showing benefit (102, 103); this pertains to replacing existing colorectal cancer screening tests with newer ones. Accordingly, researchers should evaluate the comparative accuracy of newer and older tests by using the same reference standard as trials that showed treatment benefit in the same (or similar) patients representing the appropriate disease spectrum (103). If the newer test has increased sensitivity (with similar specificity and patient safety) or similar sensitivity but other advantages (for example, improved specificity, acceptability, or accessibility), studies of test accuracy alone may support substituting this test in the absence of trial data (103). However, when new tests offer tradeoffs between desirable and undesirable attributes (for example, improved sensitivity but reduced specificity), a decision analytic model or new research may be needed. When data on new tests are incomplete or uncertain, and the costs or consequences of making assumptions from such data are potentially severe, clinicians may require further research before acting (103).
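The substitution logic in the preceding paragraph can be written out as a small decision rule. The sketch below is one possible reading of that paragraph, not a rule taken from the cited guidance; the enum, function name, parameters, and returned labels are all hypothetical.

```python
from enum import Enum

class Change(Enum):
    """How the newer test compares with the currently recommended test."""
    WORSE = -1
    SIMILAR = 0
    BETTER = 1

def evidence_pathway(sensitivity, specificity, safety,
                     other_advantages=False, uncertain_and_high_stakes=False):
    """Return the kind of evidence that may be needed before substitution."""
    if uncertain_and_high_stakes:
        # Incomplete data plus potentially severe consequences.
        return "further research before acting"
    no_loss = specificity is not Change.WORSE and safety is not Change.WORSE
    if sensitivity is Change.BETTER and no_loss:
        return "accuracy studies may suffice"
    if sensitivity is Change.SIMILAR and no_loss and (
            other_advantages or specificity is Change.BETTER):
        return "accuracy studies may suffice"
    # Tradeoffs between desirable and undesirable attributes.
    return "decision analytic model or new research"

# Better sensitivity with similar specificity and safety.
print(evidence_pathway(Change.BETTER, Change.SIMILAR, Change.SIMILAR))
# Better sensitivity but reduced specificity.
print(evidence_pathway(Change.BETTER, Change.WORSE, Change.SIMILAR))
```

The two calls correspond to the two situations the paragraph distinguishes: a clear improvement, and a tradeoff that accuracy data alone cannot settle.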
Fecal Screening Tests
As determined primarily through indirect comparisons, several fecal immunochemical tests had superior single-test sensitivity for colorectal cancer and possibly for advanced neoplasia compared with Hemoccult II. Fecal immunochemical tests had similar or somewhat lower specificity, suggesting that test choice might be important when considering substituting fecal immunochemical tests in a fecal screening program. For one quantitative fecal immunochemical test (Magstream, Fujirebio Inc., Tokyo, Japan), choice of positive cutoff values would allow programs to determine the appropriate tradeoff between improved sensitivity and specificity. Limited evidence suggested better test performance with 2- or 3-day sample collection than with 1-day collection. Ease of administration may work in favor of some fecal immunochemical tests (31), although their increased costs may reduce acceptability for payers. The relatively small increase in Medicare reimbursement for fecal immunochemical tests (exceeding those for Hemoccult II) (104) may be affecting market availability. Not all well-studied fecal immunochemical tests were both FDA approved and on the U.S. market at the time this article was written.
On the basis of fewer data and less precise estimates, Hemoccult SENSA also had increased sensitivity for colorectal cancer compared with Hemoccult II but reduced specificity. Direct comparisons with fecal immunochemical tests were few, with mixed results for sensitivity and consistently lower specificity for Hemoccult SENSA. The tradeoffs from improved sensitivity with reduced specificity in a screening program of repeated testing are best evaluated through modeling (24).
One study on screening test performance of the precommercial version of a multitarget fecal DNA test (PreGen Plus) showed improved sensitivity for colorectal cancer but not adenomas, similar or slightly reduced specificity, and higher positive rates compared with Hemoccult II (47). Test accuracy estimates for colorectal cancer were imprecise for both tests because of limited statistical power, and sensitivity and specificity of Hemoccult II in this study were lower than generally reported in higher-quality studies (31, 105). In addition, this study's findings may not be generalizable to population screening because participants were relatively older (three quarters were >65 years of age, compared with screening beginning at age 50 years) and the version of PreGen Plus tested has been supplanted by other versions (1.1 and higher) for which there are no screening population studies (Table 3). Commercial availability of fecal DNA tests may be further affected by the recent FDA requirement for premarket review of this test, which was previously considered to be outside FDA jurisdiction (106, 107). Furthermore, in the absence of trial data or modeling, fecal DNA could be considered only as a substitute for an annual or biennial FOBT in established screening programs. This could be cost-prohibitive given the relative cost for fecal DNA compared with guaiac or immunochemical tests (104). Cost concerns may underlie recommendations by the manufacturer to repeat fecal DNA screening at 5-year intervals (108). Data on health outcomes are insufficient, however, to support this interval recommendation (109).
Accuracy, Harms, and Uncertainties with CT Colonography
Computed tomographic colonography has been studied as a diagnostic test (for patients with symptoms) and, less frequently, as a screening test in average-risk asymptomatic patients. Recent publication of the ACRIN study has more than doubled the number of average-risk patients studied to determine the accuracy of CT colonography for colorectal cancer screening (55), with only 1 smaller screening study (n = 300) still pending (110). On the basis of published studies in 4312 average-risk screening patients, CT colonography screening by trained and experienced radiologists had sensitivity similar to that of colonoscopy for colorectal cancer and large adenomas (≥10 mm). However, estimates of sensitivity of CT colonography for smaller adenomas (≥6 mm) were more variable between studies (with point estimates of 78% and 88.7% and wide CIs) and were not clearly comparable to the sensitivity of colonoscopy for smaller adenomas. The health impact of potentially reduced sensitivity for smaller polyps is unclear (111). Specificity estimates for CT colonography were also quite variable between studies; for lesions 6 mm or greater, point estimates ranged from 79.6% to 88%.
Beyond issues of test accuracy, other uncertainties may affect considerations of whether this test is ready for widespread population screening. These include questions about potential harms from radiation exposure, uncertainty about extracolonic findings, uncertainty about test referral thresholds and repeat test intervals, and judgments about how the test performance seen in clinical studies will translate to the conduct of CT colonography screening examinations in community settings. Most important is how clinicians and policymakers value these remaining uncertainties and whether the costs or consequences of making assumptions from incomplete data are viewed as potentially severe, thus requiring further research before acting (103).
Immediate procedure-related harms with CT colonography appear to be minimal. The risk for perforation with air insufflation is very low, particularly in asymptomatic persons undergoing screening. Uncertainty remains about delayed harms associated with CT-related radiation exposure, an area of growing concern with more widespread use of CT for diagnostics and screening (112). The estimate of 1/1000 excess lifetime tumors in a 50-year-old after a single CT colonography examination is uncertain and could vary 2- to 3-fold. Radiation-related cancer risks could decrease if newer technologies reduce average radiation exposure (that is, from 10 mSv to about 5 mSv) (113). A recent survey of 22 institutions conducting CT colonography found a total median radiation dose per screening protocol of 5.6 mSv (range, 2.6 to 14.7 mSv) (114). Thus, because radiation doses depend on factors associated with the technology used and with decisions by the technician (112), higher radiation exposure might persist in some settings. Even assuming a 10-fold lower risk (1/10000 excess cancer risk), a recent modeling exercise (115) found that lifetime CT colonography screening (starting at age 50 years and repeated every 10 years) produced 36/100000 radiation-induced cases of cancer with 8 deaths, which offset some of the modeled mortality benefits from reductions in colonoscopy-associated complications.
Extracolonic findings that may require clinical follow-up occur relatively commonly (up to 1 in 4 asymptomatic persons undergoing CT colonography screening), with 7% to 16% clearly receiving recommendations for further diagnostic imaging tests or surgery (55, 67). Whether these extracolonic findings will ultimately provide additional benefit or harm to those undergoing CT colonography screening for colorectal cancer, and at what additional cost to the health care system, is unknown. A recent modeling study that attempted to address extracolonic findings found a net benefit (115), although the range of these findings was restricted to considering cancer and abdominal aortic aneurysms (reducing the estimated prevalence of extracolonic findings to a range of less than 1% to at most 5% of the screened population). Other limitations and concerns about the assumptions underpinning this modeling exercise have been noted elsewhere (116).
The referral threshold for colonoscopy (size of lesions detected by CT colonography) is largely based on expert opinion rather than clinical outcomes. Most, but not all (109), experts currently suggest colonoscopy referral for a polyp 6 mm or greater. This makes referral to colonoscopy relatively common, with as many as 1 in 3 persons, to as few as 1 in 8, referred after CT colonography (Table 2). An ongoing nonrandomized comparative study of colonoscopy and CT colonography screening is offering patients with only 1 or 2 polyps 6 to 9 mm in size on CT colonography the option of CT colonography surveillance instead of immediate colonoscopy, under an institutional review board-approved protocol (67, 117). Under this protocol, fewer patients (1 in 13) have been referred to colonoscopy, compared with referring all those with polyps 6 mm or greater (1 in 8). The safety of this approach is still being determined. Variability in polyp measurement due to differences among readers, CT measurement approaches, and viewing displays further complicates considerations of appropriate polyp size for colonoscopy referral after CT colonography examination (118–120).
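To make the referral figures easier to compare, the short sketch below converts the "1 in N" proportions quoted above into expected colonoscopy referrals per 1000 persons screened with CT colonography. It is simple arithmetic on the reported ratios; the function name is hypothetical.

```python
def referrals_per_1000(one_in_n):
    """Expected colonoscopy referrals per 1000 persons screened."""
    return 1000 / one_in_n

highest_reported = referrals_per_1000(3)        # as many as 1 in 3
refer_all_6mm = referrals_per_1000(8)           # all polyps 6 mm or greater
surveillance_protocol = referrals_per_1000(13)  # surveillance option for 1 or 2 polyps of 6 to 9 mm

print(f"highest reported: about {highest_reported:.0f} per 1000")
print(f"refer all 6 mm or greater: about {refer_all_6mm:.0f} per 1000")
print(f"surveillance protocol: about {surveillance_protocol:.0f} per 1000")
print(f"difference: about {refer_all_6mm - surveillance_protocol:.0f} fewer referrals per 1000")
```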
An important question for those considering implementing population colorectal cancer screening using CT colonography is whether test accuracy for this technology-dependent, operator-dependent test will be the same in nonresearch settings as in clinical studies. Studies on the accuracy of CT colonography have generally used an enhanced reference standard, which allows the separation of false-positive CT colonography results from false-negative colonoscopy results by reconciling differences with second-look colonoscopy. These studies have confirmed that colonoscopy and CT colonography miss adenomas and colorectal cancer, although reliable estimates of colonoscopy accuracy are limited by very small numbers of lesions. When considering the comparative accuracy between 2 operator-dependent technologies (CT colonography and colonoscopy), current studies are further limited by using designs that compared a larger number of experienced colonoscopists (5 to 50) to a much smaller number of experienced or very experienced radiologists (2 to 15).
As others have stated, "Accurate CT colonography with high sensitivity and specificity for polyps 6 mm in size depends on meticulous technique" (67). Differences in the experience and training of radiologist readers have been cited as the major factor underlying discrepant test accuracy estimates for CT colonography in nonscreening populations (121). Radiologists in nonacademic settings who read a validated set of 15 CT colonographies exhibited considerable individual variability in accuracy (53% to 93%) (122), consistent with our findings from 2 smaller CT screening studies comparing readers (53, 54), as well as from ACRIN, which used trained and certified readers (55). The challenges of adequately ensuring high-quality CT colonography readings are further illustrated by reports from ACRIN that half of the radiologists did not pass the initial certifying examination (after either 1.5 days of training or experience with 500 cases), although all did pass after further training (123). Clearly, specification, implementation, and monitoring of quality standards will be needed before widespread population screening with CT colonography. Activities are reported to be under way to upgrade quality metrics and training for CT colonography through the American College of Radiology (109).
Little is known about relative patient preferences for CT colonography compared with colonoscopy in average-risk screening populations, and preferences may differ from those of high-risk or symptomatic patients undergoing diagnostic CT colonography. Some data suggest that average-risk patients may prefer CT colonography for convenience, and slightly more (49.8%) would prefer CT colonography for future screening compared with those preferring colonoscopy (41.1%) (49). Issues about patient preferences will become particularly important once considerations of benefits, harms, and community accuracy are resolved. At that point, patient acceptability should also consider the 2-step process (CT colonography followed by referral colonoscopy as needed), with a second bowel preparation for colonoscopy potentially required. Same-day colonoscopy may make repeated bowel preparation unnecessary but requires coordination between radiology and gastroenterology services (124).
Availability of accurate CT colonography screening examinations that do not require any (or full) bowel preparation could greatly influence patient preferences and willingness to be screened (125, 126).
Accuracy and Harms with Colonoscopy and Flexible Sigmoidoscopy in Community Settings
Colonoscopy has presumed accuracy given its position in the diagnostic evaluation of patients screened by other colorectal cancer methods, although gastroenterologists have explicitly recognized that accuracy is highly dependent on the quality of the bowel preparation and endoscopic examination (127). Recent CT colonography studies using an enhanced standard of repeating colonoscopy examination for discordant colonoscopy–CT colonography findings have confirmed that screening colonoscopy can miss colorectal tumors as well as adenomas. Related data from tandem colonoscopy in diagnostic or high-risk screening populations suggest reasonably low miss rates for large adenomas (2.1% [CI, 0.3% to 7.3%]) (128); similarly, new or missed colorectal tumors occurred in 3.4% of a population-based cohort (n = 12487) who had previously undergone colonoscopy for any reason up to 3 years before a new diagnosis of colorectal cancer (129). Although available studies do not precisely estimate the risk for missed lesions with screening colonoscopy, all underscore the importance of quality initiatives for the performance of colonoscopy or any operator-dependent technological screening tool (127).
Colonoscopy presents a higher risk for immediate harms than do other tests. Serious harms from community endoscopies are about 10 times more common with colonoscopy (2.8 per 1000 procedures) than with flexible sigmoidoscopy (3.4 per 10000 procedures). The estimates for harms from flexible sigmoidoscopy, however, have much wider CIs. Age-specific harm rates were sought but could not be determined.
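The comparison above can be checked with simple arithmetic on the pooled estimates reported in this review. The sketch below puts both rates on the same per-1000 scale, computes their ratio, and compares how wide each confidence interval is in relative terms (upper bound divided by lower bound). The variable names and the choice of that width measure are assumptions made for illustration.

```python
# Pooled serious complications per 1000 procedures, with CI bounds, as reported.
colonoscopy = {"rate": 2.8, "ci": (1.5, 5.2)}
sigmoidoscopy = {"rate": 0.34, "ci": (0.06, 1.9)}  # 0.34 per 1000 = 3.4 per 10000

rate_ratio = colonoscopy["rate"] / sigmoidoscopy["rate"]

def relative_ci_width(estimate):
    """Upper CI bound divided by lower CI bound (a unitless width measure)."""
    lower, upper = estimate["ci"]
    return upper / lower

print(f"rate ratio, colonoscopy vs. flexible sigmoidoscopy: {rate_ratio:.1f}")
print(f"relative CI width, colonoscopy: {relative_ci_width(colonoscopy):.1f}")
print(f"relative CI width, flexible sigmoidoscopy: {relative_ci_width(sigmoidoscopy):.1f}")
```

The point-estimate ratio is somewhat under 10, and the much larger relative width for flexible sigmoidoscopy shows why the size of the difference between the two procedures is itself uncertain.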
Limitations
We reviewed the accuracy and harms of newer colorectal cancer screening tests as potential replacements for currently recommended tests. The USPSTF commissioned a separate, simultaneous decision analysis comparing different colorectal cancer screening programs to consider tradeoffs in test accuracy, repeated screening, and starting and stopping ages. Because of the targeted nature of this review, we did not formally update or address test acceptability (preferences, costs, adherence) issues; however, the importance of these issues for new technologies, such as CT colonography, may be considered as secondary to establishing the accuracy, harms, and community performance of the screening tests.
Conclusion
Some newer fecal screening tests with better sensitivity and similar specificity are reasonable substitutes for Hemoccult II testing to improve annual or biennial fecal screening programs for colorectal cancer. Modeling can help determine tradeoffs in fecal tests with improved sensitivity but reduced specificity and compare results from screening programs. Colorectal cancer screening with CT colonography in average-risk populations is likely to detect larger adenomas and colorectal cancers as well as colonoscopy does, but it is not clear that CT colonography is as sensitive for smaller adenomas (≥6 mm) or what proportion of positive CT colonography results will be false positive. We did not evaluate the clinical benefit of detecting smaller polyps in this report. In addition, uncertainties about potential radiation-related harms, the effect of extracolonic findings, and test performance in community settings still remain. Given potential harms and observed variability in test accuracy, emphasis on quality standards for implementation of any operator-dependent colorectal cancer screening tests appears prudent. Considerations about colorectal cancer screening are affected by its rapidly evolving clinical science base, by the ongoing evolution of colorectal cancer screening technologies, and by a marketplace that continues to change. Thus, frequent reconsideration of available evidence and updating of recommendations is warranted.
Appendix: Detailed Methods
Under guidance from the USPSTF, we created and received USPSTF approval for an analytic framework and key questions adapted from the 2002 USPSTF report (130). The scope of this targeted review differed from the 2002 USPSTF report in several ways:
1. We did not update the direct evidence that standard FOBT screening is effective in improving health outcomes, except in addressing longer-term follow-up from the original trials included in the 2002 report; this evidence was considered established for the 2002 report and was foundational for the last recommendation.
2. We did not update evidence on colorectal cancer screening methods not recommended after the last review (such as digital rectal examination) or omitted from this review at the workplan stage by the USPSTF because of poor test performance characteristics (such as double-contrast barium enema). A single study (n = 580) from the previous 2002 evidence report found that double-contrast barium enema used as a surveillance method after adenomatous polypectomy (with comparison to colonoscopy as the gold standard) showed a sensitivity of only 48% (CI, 24% to 67%) for polyps larger than 10 mm. A more recent study in a high-risk screening and diagnostic evaluation population comparing double-contrast barium enema with both optical and CT colonoscopy showed similarly low sensitivity estimates for large polyps (131). Given its confirmed low sensitivity for the targets of screening (lesions ≥10 mm), double-contrast barium enema as a primary colorectal cancer screening test was removed from the review.
3. Systematic review of the adherence, acceptability, and feasibility of the screening tests was not part of this updated report. Similarly, the USPSTF judged that a thorough review of cost-effectiveness analyses was beyond the scope of our review, particularly because the USPSTF was conducting a simultaneous decision analysis (24). The decision analysis focused on projected benefits to a cohort that began colorectal cancer screening at age 40 years or later for different screening strategies, different beginning and ending ages, and different intervals for rescreening after a normal test result, with varying screening test adherence (24). These 2 reports were used together by the USPSTF to make its updated recommendation on colorectal cancer screening, and they affected the scope of our updated evidence review.
Data Sources and Searches
We first searched PubMed, Database of Abstracts of Reviews of Effects, the Cochrane Database of Systematic Reviews, Institute of Medicine, National Institute for Health and Clinical Excellence, and Health Technology Assessment databases for recent systematic reviews (1999–2006) for all key questions. We also searched the National Guideline Clearinghouse, Institute of Medicine, and National Institute for Health and Clinical Excellence Web sites for relevant reports.
For each key question, we used already synthesized literature to identify all appropriate primary studies to the extent possible, supplementing with new literature searches corresponding with the end-of-search windows of relevant good-quality systematic reviews and meta-analyses. We developed literature search strategies and terms for each key question (25), with search dates guided by existing systematic reviews (including the 2002 USPSTF report) and the development of screening technology.
We conducted 5 separate literature searches, 1 for each key question (except that we combined searching for harms for key questions 3 and 3b, but conducted 2 separate combined harms searches) in both MEDLINE and the Cochrane Central Register of Controlled Trials. Although the searches were specifically designed for a particular key question, all abstracts were reviewed for inclusion in all key questions. All searches covered reports published through January 2008. For all key questions, we supplemented literature searches by reviewing bibliographies of relevant articles (including systematic reviews) and considering studies recommended by experts during and after peer review.
For key question 2a (accuracy of flexible sigmoidoscopy and colonoscopy), we found no systematic reviews conforming to our inclusion and exclusion criteria more recent than the 2002 USPSTF review and therefore searched MEDLINE and the Cochrane Library from January 2000 through January 2008 for primary literature.
Key question 2b (test performance characteristics of newer screening tests) covered 3 tests: CT colonography, fecal immunochemical tests, and fecal DNA tests. We found 11 systematic reviews relevant to newer colorectal cancer screening tests: 6 of CT colonography screening (27, 28, 132–135), 3 of fecal DNA screening (29, 136, 137), and 2 of fecal immunochemical screening tests (31, 37). On the basis of their use of comprehensive search strategies, recent search dates (last search date at least within the last 3 years or no older than 2005), and use of quality assessment of articles as quality indicators, we selected 3 reviews (2 of CT colonography [27, 28] and 1 of fecal DNA testing [29]) to substitute for a portion of the comprehensive search strategy necessary to locate primary studies for key question 2b (26). We searched MEDLINE and the Cochrane Library for additional primary studies of CT colonography and fecal DNA testing (January 2006 through January 2008) beginning after the latest systematic review search date. We considered all studies examining CT colonography screening in average-risk patients from the selected reviews (27, 28), supplemented by studies in average-risk patients located through our literature search; as a final check, we examined the included studies in other relevant systematic reviews of CT colonography. No additional eligible studies were identified. Although we found several reviews of fecal immunochemical tests (key questions 2 and 3b), none met our standards for methods and reporting. We therefore searched MEDLINE and the Cochrane Library from 1990, when these tests began to be described, through January 2008. We checked our search results against 2 systematic reviews located during our review process to supplement with any potentially relevant studies not already identified (31, 37).
For key questions 3a and 3b (harms of screening tests), we found no systematic reviews more recent than the 2002 USPSTF review and therefore searched MEDLINE and the Cochrane Library from January 2000 through January 2008 and coded abstracts from both approaches.
Study Selection
In total, we evaluated 3948 abstracts and 490 full-text articles. Abstracts and articles were reviewed against specified inclusion criteria (see below) and required agreement of 2 reviewers. Eligible studies reported on the performance of colorectal cancer screening tests (sensitivity and specificity) or health outcomes. We excluded studies that did not address average-risk populations for colorectal cancer screening, unless an average-risk subgroup was reported. We excluded case-control studies of screening accuracy because these may overestimate sensitivity as a design-related source of bias (30), a problem recently demonstrated clearly for FOBTs (31). To avoid biases related to reference standards, we excluded studies of test accuracy that incompletely applied a valid reference standard or used an inadequate reference standard (32). For CT colonography, we considered only technologies that were compared against colonoscopy in average-risk populations, used a multidetector (not single-detector) scanner (27), and reported per-patient sensitivity and specificity.
Quality Assessment and Data Abstraction
Two investigators critically appraised and quality-rated all eligible studies by using design-specific USPSTF criteria (see below) (33) supplemented by National Institute for Clinical Excellence (138) and Oxman and Guyatt (139) criteria for systematic reviews and QUADAS criteria for diagnostic accuracy studies (140). Only good-quality systematic reviews were used as sources for primary articles, and all poor-quality studies were excluded from the review. One investigator abstracted key elements of all included studies into standardized evidence tables. A second reviewer verified these data. Disagreements about data abstraction or quality appraisal were resolved by consensus. Evidence tables and excluded studies tables for each key question are available in the full report (25).
Data Synthesis and Analysis
We primarily report qualitative synthesis of the results for most key questions because of study heterogeneity. Results of key questions 2b and 3b were judged to be too heterogeneous in terms of populations, settings, and study designs for meta-analysis and were therefore qualitatively synthesized. The performance of screening tests is preferentially described per person (sensitivity and specificity), supplemented by per-polyp analysis (miss rates). Ninety-five percent CIs are reported when available.
Because of the stringency of our inclusion criteria for key question 3a (complications of endoscopy), which focused on estimates of harms in the community practice setting, the studies we included were thought to be clinically homogeneous enough to allow pooling of complication rates. Meta-analysis was performed to estimate combined complication rates for major or serious bleeding, perforation, and total serious adverse events that require hospital admission or result in death, including perforation, major bleeding, severe abdominal symptoms, and cardiovascular events. Several studies reported that their patients experienced no adverse events, and therefore we used a logistic random-effects model (35, 36) to include studies without any adverse events and estimate the combined complication rates. The model is described briefly as follows.
Suppose that there are $i = 1, \ldots, n$ studies and that the number of complications and total procedures are $x_i$ and $n_i$ for study $i$. Denote the complication rate from each study by $p_i$; then we have
$$x_i \sim \mathrm{Binomial}(n_i, p_i) \qquad (1)$$
$$\mathrm{logit}(p_i) = \beta_0 + \theta_i, \quad \theta_i \sim N(0, \tau^2) \qquad (2)$$
where $\theta_i$ is the random effect across studies and $\tau^2$ estimates the heterogeneity among studies on the logit scale. The combined complication rate, $p_{com}$, would be estimated by
$$p_{com} = \frac{\exp(\beta_0)}{1 + \exp(\beta_0)} \qquad (3)$$
This model allows inclusion of studies with no adverse events, and the random effects incorporate variation among studies into the combined estimate. A P value less than 0.05 for $\tau^2$ is considered to represent statistically significant heterogeneity.
Exploratory meta-regressions were conducted by using logistic random-effects models to examine the association of complication rate with important study-level characteristics: study design; study setting by country; and population characteristics, including age range and indication for endoscopy. To do this, we need to add only one more term to equation (2) of the logistic random-effects model:
$$\mathrm{logit}(p_i) = \beta_0 + \beta_1 z_i + \theta_i \qquad (4)$$
where $z_i$ represents any study-level characteristic from study $i$, and the association of this study characteristic with complication rate is investigated through $\beta_1$.
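One way to see why studies with no adverse events can stay in such a model is to write out the marginal likelihood that a logistic random-effects fit maximizes. The expression below is a sketch in assumed notation, not an equation printed in the source: $\beta_0$ is the intercept, $\beta_1$ the coefficient of the study-level characteristic $z_i$, and $\theta$ a normally distributed random effect with variance $\tau^2$.

```latex
L(\beta_0, \beta_1, \tau^2)
  = \prod_{i=1}^{n} \int_{-\infty}^{\infty}
    \binom{n_i}{x_i}\, p_i(\theta)^{x_i}\, \{1 - p_i(\theta)\}^{n_i - x_i}\,
    \frac{1}{\sqrt{2\pi\tau^2}} \exp\!\left(-\frac{\theta^2}{2\tau^2}\right) d\theta,
\qquad
p_i(\theta) = \frac{\exp(\beta_0 + \beta_1 z_i + \theta)}{1 + \exp(\beta_0 + \beta_1 z_i + \theta)}
```

Setting $\beta_1 = 0$ gives the pooled model without covariates. For a study with $x_i = 0$, the integrand reduces to $\{1 - p_i(\theta)\}^{n_i}$ times the normal density, which is finite and still carries information about how small the rate must be, so no continuity correction is needed.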
The analysis was performed by using the NLMIXED procedure in SAS software, version 9.1 (SAS Institute, Cary, North Carolina), with the code listed in Appendix Table 3.
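For readers without SAS, the following Python sketch shows one way a logistic random-effects model of this kind could be fitted by maximizing the marginal likelihood. It is not the authors' NLMIXED code from Appendix Table 3. The study counts and the binary covariate are hypothetical, and the fixed integration grid and simple coordinate search are stand-ins chosen to keep the example dependency-free rather than the routines NLMIXED uses.

```python
import math

# Hypothetical (events, procedures, covariate) per study; not data from the review.
STUDIES = [
    (0, 1200, 0), (3, 2500, 0), (5, 1800, 1),
    (12, 4000, 1), (2, 3000, 0), (9, 1500, 1),
]

# Fixed grid for integrating over a standard normal variable u (random effect = tau * u).
H = 0.05
GRID = [-6.0 + H * k for k in range(241)]
LOG_W = [math.log(H) - 0.5 * u * u - 0.5 * math.log(2 * math.pi) for u in GRID]


def log_expit(eta):
    """Numerically stable log(1 / (1 + exp(-eta)))."""
    if eta > 0:
        return -math.log1p(math.exp(-eta))
    return eta - math.log1p(math.exp(eta))


def neg_log_lik(params, studies, use_covariate):
    """Marginal negative log-likelihood (constant binomial coefficients dropped)."""
    b0 = params[0]
    b1 = params[1] if use_covariate else 0.0
    tau = math.exp(params[-1])  # optimise log(tau) so that tau stays positive
    total = 0.0
    for x, n, z in studies:
        terms = []
        for u, lw in zip(GRID, LOG_W):
            eta = b0 + b1 * z + tau * u
            terms.append(x * log_expit(eta) + (n - x) * log_expit(-eta) + lw)
        m = max(terms)  # log-sum-exp over the grid
        total += m + math.log(sum(math.exp(t - m) for t in terms))
    return -total


def pattern_search(f, start, step=0.5, tol=1e-3, max_sweeps=200):
    """Derivative-free coordinate search; a simple stand-in for a real optimiser."""
    x, fx = list(start), f(start)
    for _ in range(max_sweeps):
        if step <= tol:
            break
        improved = False
        for j in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[j] += delta
                ft = f(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            step /= 2
    return x, fx


def fit(studies, use_covariate=False):
    """Return ([intercept, (slope,) log tau], negative log-likelihood)."""
    events = sum(x for x, _, _ in studies)
    procedures = sum(n for _, n, _ in studies)
    crude = (events + 0.5) / (procedures + 1.0)
    start = [math.log(crude / (1 - crude))]
    start += [0.0] if use_covariate else []
    start += [math.log(0.5)]
    return pattern_search(lambda p: neg_log_lik(p, studies, use_covariate), start)


# Pooled model: back-transform the intercept to a rate.
params, nll = fit(STUDIES)
p_com = 1 / (1 + math.exp(-params[0]))
print(f"pooled rate per 1000 procedures: {1000 * p_com:.2f}")

# Meta-regression: one added study-level term.
params_z, nll_z = fit(STUDIES, use_covariate=True)
print(f"covariate coefficient on the logit scale: {params_z[1]:.2f}")
```

The zero-event study enters the likelihood like any other, which is the property the random-effects approach was chosen for. A production analysis would use an established mixed-model routine and report standard errors, which this sketch omits.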
Review Oversight and Peer Review
The Agency for Healthcare Research and Quality funded this work, provided project oversight, and assisted with internal and external review of the draft evidence synthesis but had no role in the design, conduct, or reporting of the review. The authors worked with 4 USPSTF liaisons at key points throughout the review process to develop and refine the analytic framework questions, set the review scope, and resolve methodologic issues during the conduct of the review. A draft of the evidence synthesis was reviewed by 8 experts, including experts in the fields of gastroenterology and radiology, and several experts who have written systematic evidence reviews on one or more aspects of colorectal cancer screening.
Article and Author Information
- Acknowledgment: The authors thank the following peer reviewers for the evidence report (alphabetical): James Allison, MD, Carrie Klabunde, PhD, Ted Levin, MD, Perry Pickhardt, MD, Margaret Piper, PhD, MPH, David Ransohoff, MD, Robert Smith, PhD, and Steve Woolf, MD, MPH; Oregon Evidence-based Practice Center staff: Kevin Lutz, MA, Taryn Cardenas, BA, Rebecca Newton-Thompson, MD, MPH, Elizabeth O'Connor, PhD, Mark Helfand, MD, MS, MPH, and Daphne Plaut, MLS; and Centers for Disease Control and Prevention staff: Laura Seeff, MD.
- Grant Support: This study was conducted by the Oregon Evidence-based Practice Center under contract to the Agency for Healthcare Research and Quality (contract HHSA-290-2007-10057-I-EPC3, task order 3).
- Potential Financial Conflicts of Interest: None disclosed.
- Requests for Single Reprints: Reprints are available from the Agency for Healthcare Research and Quality Web site (http://www.ahrq.gov/clinic/uspstfix.htm).
- Current Author Addresses: Drs. Whitlock, Lin, and Liles and Ms. Beil: Kaiser Permanente Center for Health Research, Kaiser Permanente Northwest, 3800 North Interstate Avenue, Portland, OR 97227. Dr. Fu: Oregon Health & Science University, 3181 SW Sam Jackson Park Road, Portland, OR 97239.