ORIGINAL ARTICLE
Effect of Three Decades of Screening Mammography on
Breast-Cancer Incidence
Archie Bleyer, M.D., and H. Gilbert Welch, M.D., M.P.H.
There are two prerequisites for screening to reduce
the rate of death from cancer.1,2 First, screening must advance the time of diagnosis
of cancers that are destined to cause death. Second, early treatment of these
cancers must confer some advantage over treatment at clinical presentation.
Screening programs that meet the first prerequisite will have a predictable
effect on the stage-specific incidence of cancer. As the time of diagnosis is
advanced, more cancers will be detected at an early stage and the incidence of
early-stage cancer will increase. If the time of diagnosis of cancers that will
progress to a late stage is advanced, then fewer cancers will be present at a
late stage and the incidence of late-stage cancer will decrease.3
In the United States, clinicians now have more than three decades of
experience with the widespread use of screening mammography in women who are 40
years of age or older. We examined the temporal effects of mammography on the
stage-specific incidence of breast cancer. Specifically, we quantified the
expected increase in the incidence of early-stage cancer and determined the
extent to which this has led to a corresponding decrease in the incidence of
late-stage cancer.
METHODS
Overview
We obtained trend data on the use of screening
mammography and the stage-specific incidence of breast cancer among women 40
years of age or older. To calculate the number of additional women with a
diagnosis of early-stage cancer (as well as the reduction in the number of women
with a diagnosis of late-stage cancer), we determined a baseline incidence
before screening, calculated the surplus (or deficit) incidence relative to the
baseline in each subsequent calendar year, and transformed data on the change
in incidence to data on nationwide counts.
We used the direct method to adjust the incidence
rates according to age in the U.S. standard population in the year 2000. All
analyses were performed with the use of either SEER*Stat or Microsoft Excel
software. In an effort to make our method transparent, the data on
Surveillance, Epidemiology, and End Results (SEER) stage–specific incidence and
all calculations are provided in the Supplementary Appendix, available with the full text of this article at NEJM.org. Both authors
vouch for the completeness and accuracy of the reported data and analysis and
the fidelity of the study to the protocol.
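The direct method of age adjustment mentioned above can be sketched as follows. This is a minimal illustration, not the authors' SEER*Stat procedure: the function name `direct_age_adjusted_rate`, the age bands, the rates, and the standard-population counts are all hypothetical and are not taken from SEER or from the 2000 U.S. standard population.

```python
# Direct age standardization: weight each age-specific rate by the share of
# the standard population that falls in the same age band.

def direct_age_adjusted_rate(age_specific_rates, standard_population):
    """Return the age-adjusted rate per 100,000.

    age_specific_rates: age band -> rate per 100,000 in the study population
    standard_population: age band -> count in the standard population
    """
    total_standard = sum(standard_population.values())
    return sum(
        rate * standard_population[band] / total_standard
        for band, rate in age_specific_rates.items()
    )

# Hypothetical inputs for illustration only (not SEER data).
rates = {"40-49": 150.0, "50-64": 300.0, "65+": 450.0}
standard = {"40-49": 40_000, "50-64": 40_000, "65+": 20_000}

adjusted = direct_age_adjusted_rate(rates, standard)
print(f"{adjusted:.1f} per 100,000")
```

Because the weights come from a fixed standard population, rates adjusted this way can be compared across calendar years even when the age structure of the actual population shifts.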
Data Sources
We obtained trend data from the National Health
Interview Survey on the proportion of women 40 years of age or older who
underwent screening mammography.4,5 Trend data on incidence and survival rates were
obtained from the nine long-standing SEER areas6; these data
accounted for approximately 10% of the U.S. population.7 Annual estimates of the population of women 40
years of age or older were obtained from the U.S. Census.8
Stage at Diagnosis
We used SEER historic stage A as the foundation for
our categorization of early- and late-stage cancer. The four stages in this
system are the following: in situ disease; localized disease, defined as
invasive cancer that is confined to the organ of disease origin; regional
disease, defined as disease that extends outside of and adjacent to or
contiguous with the organ of disease origin (in breast cancer, most regional
disease indicates nodal involvement, not direct extension9); and
distant disease, defined as metastasis to organs that are not adjacent to the
organ of disease origin. We restricted in situ cancers to ductal carcinoma in
situ (DCIS), specifically excluding lobular carcinoma in situ, as done in other
studies.10 We defined early-stage cancer as DCIS or localized
disease, and late-stage cancer as regional or distant disease.
Baseline Incidence
The incidence data from the first year in which
breast-cancer incidence was recorded (1973) were almost certainly spuriously
low (which would bias our estimates of excess detection upward). The data from
the subsequent 2 years (1974 and 1975) were above average for the decade,
reflecting the sharp uptick in early detection after First Lady Betty Ford's
breast-cancer diagnosis.11 Consequently,
we chose the 3-year period 1976 through 1978 to obtain our estimate of the
baseline incidence of breast cancer that was detected without mammography.
During this period, the incidence of breast cancer was stable and few cases of
DCIS were detected; these findings are compatible with the very limited use of
screening mammography.
Current Incidence and Removal of the Effect of Hormone-Replacement
Therapy
We based our estimate of the current incidence of
breast cancer on the 3-year period from 2006 through 2008. To eliminate the
effect of hormone-replacement therapy, we truncated the observed incidence each
year from 1990 through 2005 if it was higher than the estimate of the current
incidence (Table S2 and Fig. S1 in the Supplementary Appendix). In other words, we did not allow the annual incidence of DCIS to
exceed 56.5 cases, localized disease to exceed 177.5 cases, regional disease to
exceed 77.6 cases, and distant disease to exceed 16.6 cases (all expressed per
100,000 women) during the period from 1990 through 2005. Other researchers have
dated the end of the effect of hormone-replacement therapy at 2006.12 Thus, our approach was simply to remove all excess
incidence in previous years.
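The truncation rule described above can be sketched as a simple cap. The four cap values are the ones quoted in the text. The function name `truncate_incidence`, the stage keys, and the observed values in the usage lines are hypothetical.

```python
# Caps per 100,000 women, as quoted in the text (the 2006-2008 estimates).
CAPS = {"dcis": 56.5, "localized": 177.5, "regional": 77.6, "distant": 16.6}
CAP_YEARS = range(1990, 2006)  # 1990 through 2005, inclusive

def truncate_incidence(year, stage, observed):
    """Cap the observed incidence at the current estimate during 1990-2005."""
    if year in CAP_YEARS:
        return min(observed, CAPS[stage])
    return observed

# Hypothetical observed values, for illustration only.
capped = truncate_incidence(1999, "regional", 85.0)         # inside window, above cap
outside_window = truncate_incidence(1985, "regional", 85.0)  # before 1990, untouched
below_cap = truncate_incidence(1999, "regional", 70.0)       # inside window, below cap
print(capped, outside_window, below_cap)
```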
Estimates of the Number of Women Affected
Base-Case Estimate
For each year after 1978, we calculated the
absolute change in the incidence of early- and late-stage cancer relative to
the 1976–1978 baseline incidence (after removing the transient increase in
incidence associated with hormone-replacement therapy during the period from
1990 through 2005, as described above). To calculate the excess in the number
of women with a diagnosis of early-stage cancer detected on screening
mammography, we multiplied the absolute increase in incidence observed in a
given year by the number of women in the population who were 40 years of age or
older in the same year. We used a similar approach to calculate the reduction
in the number of women with a diagnosis of late-stage cancer. Finally, we
summed the data across the three decades.
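The base-case calculation above amounts to a sum, over calendar years, of the change in incidence times the population at risk. A minimal sketch follows. The baseline of 112 early-stage cases per 100,000 women is the figure reported in the Results. The yearly incidence and population values, and the function name `excess_women`, are hypothetical placeholders rather than SEER or Census data.

```python
def excess_women(incidence_by_year, baseline, population_by_year):
    """Sum over years of (incidence - baseline), per 100,000, times population.

    A positive result is a surplus of diagnoses relative to the baseline;
    a negative result is a deficit.
    """
    total = 0.0
    for year, incidence in incidence_by_year.items():
        change_per_100k = incidence - baseline
        total += change_per_100k * population_by_year[year] / 100_000
    return total

# Hypothetical yearly inputs, for illustration only.
early_incidence = {1979: 120.0, 1980: 130.0, 1981: 140.0}
population = {1979: 40_000_000, 1980: 41_000_000, 1981: 42_000_000}
baseline_early = 112.0  # early-stage baseline per 100,000 women (Table 1)

surplus = excess_women(early_incidence, baseline_early, population)
print(f"Surplus early-stage diagnoses: {surplus:,.0f} women")
```

The same function applied to late-stage incidence returns a negative number when incidence has fallen below the baseline, which corresponds to the reduction in late-stage diagnoses.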
Subsequent Estimates
The base-case estimate implicitly assumes that,
with the exception of the effect of hormone-replacement therapy, the underlying
incidence of breast cancer is constant. To make an inference about any other
changes in the underlying incidence, we examined incidence trends in the
portion of the population that generally did not have exposure to screening:
women younger than 40 years of age. In this age group, the SEER calculation for
the annual percent change from 1979 through 2008 was 0.25% per year (95%
confidence interval [CI], 0.04 to 0.47). To account for this growth, we
repeated our analysis, allowing our baseline incidence among women 40 years of
age or older to increase by 0.25% per year (applied to both early- and
late-stage disease). We called this estimate the “best guess.”
Finally, we wanted to provide estimates that were clearly biased in
favor of screening mammography — ones that would minimize the surplus diagnoses
of early-stage cancer and maximize the deficit of diagnoses of late-stage
cancer. First, we assumed that the underlying incidence was increasing at a
rate of 0.5% per year — twice as high as that observed among the population of
women who were younger than 40 years of age. We called this estimate the
“extreme” assumption. Second, in addition to the increase of 0.5% per year, we
revised the baseline incidence of late-stage breast cancer by using the highest
incidence observed in the data (113 cases per 100,000 women in 1985) — thereby
maximizing the deficit of diagnoses of late-stage cancer. We called this
estimate the “very extreme” assumption.
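The rising-baseline scenarios above can be sketched as follows. Two details are assumptions made here, not statements from the article: the growth is compounded annually, and it is anchored at 1978 (the last baseline year). The function name `rising_baseline` is hypothetical. The very extreme scenario additionally replaces the late-stage baseline with 113 cases per 100,000 women and is not modeled in this sketch.

```python
def rising_baseline(baseline, year, annual_growth, anchor_year=1978):
    """Baseline incidence in `year`, compounding `annual_growth` from `anchor_year`.

    Annual compounding and the 1978 anchor are assumptions of this sketch.
    """
    return baseline * (1.0 + annual_growth) ** (year - anchor_year)

GROWTH = {"base case": 0.0, "best guess": 0.0025, "extreme": 0.005}
EARLY_BASELINE = 112.0  # early-stage cases per 100,000 women, 1976-1978 (Table 1)

baseline_2008 = {
    name: rising_baseline(EARLY_BASELINE, 2008, growth)
    for name, growth in GROWTH.items()
}
for name, value in baseline_2008.items():
    print(f"{name}: {value:.1f} per 100,000")
```

A higher assumed baseline leaves less of the observed early-stage incidence to count as surplus, which is why the faster growth rates favor screening.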
RESULTS
Changes in Incidence Associated with Implementation of Screening
Figure 1. Use of Screening Mammography and Incidence of Stage-Specific Breast Cancer in the United States, 1976–2008.
Figure 1A shows the
substantial increase in the use of screening mammography during the 1980s and
early 1990s among women 40 years of age or older in the United States. Figure 1A also shows
that there was a substantial concomitant increase in the incidence of
early-stage breast cancer among these women. In addition, a small decrease is
evident in the incidence of late-stage breast cancer. As shown in Figure 1B, there was little change in breast-cancer incidence among women who
generally did not have exposure to screening mammography — women younger than
40 years of age.

Table 1. Absolute Change in the Incidence of Stage-Specific Breast Cancer among Women 40 Years of Age or Older after the Introduction of Screening Mammography.
Table 1 shows the changes in the stage-specific annual
incidence of breast cancer over the past three decades among women 40 years of
age or older. The large increase in cases of early-stage cancer (from 112 to
234 cancers per 100,000 women — an absolute increase of 122 cancers per
100,000) reflects both detection of more cases of localized disease and the
advent of the detection of DCIS (which was virtually not detected before
mammography was available). The smaller decrease in cases of late-stage cancer
(from 102 to 94 cases per 100,000 women — an absolute decrease of 8 cases per
100,000 women) largely reflects detection of fewer cases of regional disease.
If a constant underlying disease burden is assumed, only 8 of the 122
additional early diagnoses were destined to progress to advanced disease,
implying a detection of 114 excess cases per 100,000 women.
Table 1 also shows
the estimated number of women affected by these changes (after removal of the
transient excess cases associated with hormone-replacement therapy). These
estimates are shown in terms of both the surplus in diagnoses of early-stage
breast cancers and the reduction in diagnoses of late-stage breast cancers —
again, under the assumption of a constant underlying disease burden.
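The per-100,000 arithmetic behind the figures quoted from Table 1 can be checked directly. All four input values come from the text above. The variable names are illustrative.

```python
# Incidence per 100,000 women 40 years of age or older, as quoted from Table 1.
early_before, early_after = 112, 234  # early-stage: baseline vs. current
late_before, late_after = 102, 94     # late-stage: baseline vs. current

early_increase = early_after - early_before        # additional early diagnoses
late_decrease = late_before - late_after           # fewer late diagnoses
excess_detection = early_increase - late_decrease  # early diagnoses with no late-stage offset

print(early_increase, late_decrease, excess_detection)
```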

Overdiagnosed Cancer and Effect of Screening on Regional and Distant
Disease
Table 2. Four Estimates of the Excess Detection (Overdiagnosis) of Breast Cancer Associated with Three Decades of Screening Mammography, 1979–2008.
Table 2 shows the effects of relaxing the assumption of a
constant underlying disease burden on the estimate of the number of women with
cancer that was overdiagnosed. The base-case estimate incorporates the data in Table 1. In the best-guess estimate, it was assumed that the trend in the
underlying incidence was best approximated by the incidence observed among
women younger than 40 years of age (Figure 1B). This approach suggests that the excess detection attributable to
mammography in the United States involved more than 1.3 million women in the
past 30 years. In the extreme and very extreme estimates, it was assumed that
the underlying incidence was increasing at double the rate observed among women
younger than 40 years of age. Finally, in the very extreme estimate, it was also
assumed that the baseline incidence of late-stage cancer was the highest incidence ever
observed (thereby maximizing the deficit of diagnoses of late-stage cancer).
Regardless of the approach used, our estimate of
overdiagnosed cancers attributable to mammography over the past 30 years
involved more than 1 million women. In 2008, the number of women 40 years of
age or older with overdiagnosed cancers was more than 70,000 per year according
to the best-guess estimate, more than 60,000 per year according to the extreme
estimate, and more than 50,000 per year according to the very extreme estimate.
The corresponding estimates of the proportions of cancers that were
overdiagnosed are 31%, 26%, and 22%.
Figure 2. Trends in the Annual Incidence of Late-Stage Breast Cancer and Its Two Components (Regional and Distant Disease) among U.S. Women 40 Years of Age or Older, 1976–2008.
Figure 2 shows the trends in regional and distant late-stage
breast cancer. The variable pattern in late-stage cancer (which includes the
excess diagnoses associated with hormone-replacement therapy in the late 1990s
and early 2000s) was virtually entirely attributable to changes in the
incidence of regional (largely node-positive) disease. The incidence of distant
(metastatic) disease, however, has remained unchanged (95% CI for the annual
percent change, −0.19 to 0.14).

DISCUSSION
Screening can result in both the benefit of a reduction in mortality and the harm of
overdiagnosis. Our analysis suggests that whatever the mortality benefit,
breast-cancer screening involved a substantial harm of excess detection of
additional early-stage cancers that was not matched by a reduction in
late-stage cancers. This imbalance indicates a considerable amount of
overdiagnosis involving more than 1 million women in the past three decades — and,
according to our best-guess estimate, more than 70,000 women in 2008
(accounting for 31% of all breast cancers diagnosed in women 40 years of age or
older).
Over the same period, the rate of death from breast
cancer decreased considerably. Among women 40 years of age or older, deaths
from breast cancer decreased from 71 to 51 deaths per 100,000 women — a 28%
decrease.6 This reduction in mortality is probably due to some
combination of the effects of screening mammography and better treatment. Seven
separate modeling exercises by the Cancer Intervention and Surveillance
Modeling Network investigators provided a wide range of estimates for the
relative contribution of each effect: screening mammography might be
responsible for as little as 28% or as much as 65% of the observed reduction in
mortality (the remainder being the effect of better treatment).13
Our data show that the true contribution of mammography to decreasing mortality
must be at the low end of this range. They suggest
that mammography has largely not met the first prerequisite for screening to
reduce cancer-specific mortality — a reduction in the number of women who
present with late-stage cancer. Because the absolute reduction in deaths (20
deaths per 100,000 women) is larger than the absolute reduction in the number
of cases of late-stage cancer (8 cases per 100,000 women), the contribution of
early detection to decreasing numbers of deaths must be small. Furthermore, as
noted by others,14 the small reduction in cases of late-stage cancer
that has occurred has been confined to regional (largely node-positive) disease
— a stage that can now often be treated successfully, with an expected 5-year
survival rate of 85% among women 40 years of age or older.15,16 Unfortunately, however, the number of women in the
United States who present with distant disease, only 25% of whom survive for 5
years,15 appears not to have been affected by screening.
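The comparison between the mortality reduction and the late-stage reduction rests on a few lines of arithmetic, shown here as a check. All input values are quoted from the text. The variable names are illustrative.

```python
# Deaths from breast cancer per 100,000 women 40 years of age or older.
deaths_before, deaths_after = 71, 51
# Late-stage incidence per 100,000 women (Table 1).
late_before, late_after = 102, 94

absolute_death_reduction = deaths_before - deaths_after            # per 100,000
relative_death_reduction = absolute_death_reduction / deaths_before
late_stage_reduction = late_before - late_after                    # per 100,000

print(f"Deaths: -{absolute_death_reduction} per 100,000 "
      f"({relative_death_reduction:.0%} decrease)")
print(f"Late-stage cases: -{late_stage_reduction} per 100,000")
```

The reduction in deaths is larger than the reduction in late-stage cases, which is the basis for the argument that earlier detection cannot account for most of the mortality decline.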
Whereas the decrease in the rate of death from breast cancer was 28%
among women 40 years of age or older, the concurrent rate decrease was 42%
among women younger than 40 years of age.6 In other words, there was a larger relative
reduction in mortality among women who were not exposed to screening mammography
than among those who were exposed. We are left to conclude, as others have,17,18 that the good
news in breast cancer — decreasing mortality — must largely be the result of
improved treatment, not screening. Ironically, improvements in treatment tend
to erode the benefit of screening. As treatment of clinically detected
disease (detected by means other than screening) improves, the benefit of screening
diminishes. For example, since pneumonia can be treated successfully, no one
would suggest that we screen for pneumonia.
Our finding of substantial overdiagnosis of breast cancer with the use
of screening mammography in the United States replicates the findings of
investigators in other countries (Table S5 in the Supplementary Appendix).
Nevertheless, our analysis has several limitations. Overdiagnosis can never be
directly observed and thus can only be inferred from that which is observed —
reported incidence. Figure 1 and Figure 2 are based on
unaltered, long-standing, carefully collected federal data that are generally
considered to be incontrovertible. Table 1 and Table 2, however, are based on assumptions that warrant a
more critical evaluation.
First, our results might be sensitive to the period (1976 through 1978)
that we chose to obtain data for the baseline incidence of breast cancer
(before mammography). If the period were expanded to begin with the first years
of SEER data (i.e., 1973 through 1978), the baseline incidence of early-stage
cancer would be slightly lower (0.9%) and the incidence of late-stage cancer
would be slightly higher (1.4%). These changes offset each other and have a
negligible effect on our estimates.
Second, our ability to remove the effect of hormone-replacement therapy
(Fig. S1 in the Supplementary Appendix) is
admittedly imprecise. Although there is general agreement that this effect had
largely ceased by 2006, its onset is not as discrete. We chose to cap the
incidence of each disease stage as far back as 1990. However, the pattern of
regional disease (Figure 2) suggests that the bulk of the effect of
hormone-replacement therapy probably began later, in the mid-1990s, such that
our assumption probably overcorrects for the effect of hormone-replacement
therapy.
Third, we were forced to make some assumptions about the pattern of the
underlying incidence — the incidence that would have been observed in the absence of screening. The simplest
approach was to assume that the underlying incidence was constant (the base
case). In our best-guess estimate, however, we posited that the underlying
incidence was that observed in the population of women without exposure to
mammography; this underlying incidence was increasing at a rate of 0.25% per
year. Our assumption of an increase of 0.5% per year (in the extreme and very
extreme estimates) was admittedly arbitrary. It was twice the rate of increase
observed among women younger than 40 years of age and was outside the 95%
confidence interval. Perspective on the uncertainty about the underlying
incidence, however, is provided in Figure 2. The finding of a stable rate of distant disease argues against
dramatic changes in the underlying incidence of breast cancer.
Fourth, our best-guess estimate of the frequency of
overdiagnosis — 31% of all breast cancers — did not distinguish between DCIS
and invasive breast cancer. Our method did not allow us to disentangle the two.
We did, however, estimate the frequency of overdiagnosis of invasive breast
cancer under the assumption that all cases of DCIS were overdiagnosed. This
analysis suggested that invasive disease accounted for about half the
overdiagnoses shown in Table 2 and that
about 20% of all invasive breast cancers were overdiagnosed; these findings
replicate those of other studies.19
Finally, some investigators might point out that
our best-guess estimate of the frequency of overdiagnosis — 31% — was based on
the wrong denominator. Our denominator was the number of all diagnosed breast
cancers. Many investigators would argue that because overdiagnosis is the
result of screening, the correct denominator is screening-detected breast
cancers. Unfortunately, because the SEER program does not collect data on the
method of detection, we were unable to distinguish screening-detected from
clinically detected cancers. Self-reported data from the National Health
Interview Survey, however, suggest that approximately 60% of all breast cancers
were detected by means of screening in the period from 2001 through 2003.20
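The effect of changing the denominator can be illustrated with a rough calculation. This sketch is not a result reported in the article. It assumes that every overdiagnosed cancer was detected by screening and that the 60% figure from 2001 through 2003 also applies to the year of the 31% estimate. Both assumptions are made here for illustration only.

```python
# Shares of all breast cancers diagnosed in women 40 years of age or older.
overdiagnosed_share_of_all = 0.31    # best-guess estimate quoted in the text
screen_detected_share_of_all = 0.60  # self-reported survey data, 2001-2003

# Assumption: all overdiagnosed cancers are screening-detected, so the same
# numerator is divided by the smaller, screening-only denominator.
overdiagnosed_share_of_screen_detected = (
    overdiagnosed_share_of_all / screen_detected_share_of_all
)
print(f"{overdiagnosed_share_of_screen_detected:.0%} of screening-detected cancers")
```

Under these assumptions the proportion is necessarily higher with the screening-detected denominator than with the all-cancers denominator.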
Breast-cancer overdiagnosis is a complex and
sometimes contentious issue. Ideally, reliable estimates about the magnitude of
overdiagnosis would come from long-term follow-up after a randomized trial.21 Among the nine randomized trials of mammography,
the lone example of this is the 15-year follow-up after the end of the Malmö
Trial,22 which showed that about a quarter of
mammographically detected cancers were overdiagnosed.23 Unfortunately, trials also provide a relatively
narrow view involving one subgroup of patients, one research protocol, and one
point in time. We are concerned that the trials — now generally three decades
old — no longer provide relevant data on either the benefit with respect to
reduced mortality (because treatment has improved) or the harm of overdiagnosis
(because of enhancements in mammographic imaging and lower radiologic and
pathological diagnostic thresholds).
Our investigation takes a different view, which
might be considered the view from space. It does not involve a selected group
of patients, a specific protocol, or a single point in time. Instead, it
considers national data over a period of three decades and details what has
actually happened since the introduction of screening mammography. There has
been plenty of time for the surplus of diagnoses of early-stage cancer to
translate into a reduction in diagnoses of late-stage cancer — thus eliminating
concern about lead time.24 This broad view is the major strength of our study.
Our study raises serious questions about the value of screening
mammography. It clarifies that the benefit of mortality reduction is probably
smaller, and the harm of overdiagnosis probably larger, than has been
previously recognized. And although no one can say with certainty which women
have cancers that are overdiagnosed, there is certainty about what happens to
them: they undergo surgery, radiation therapy, hormonal therapy for 5 years or
more, chemotherapy, or (usually) a combination of these treatments for
abnormalities that otherwise would not have caused illness. Proponents of
screening should provide women with data from a randomized screening trial that
reflects improvements in current therapy and includes strategies to mitigate
overdiagnosis in the intervention group. Women should recognize that our study
does not answer the question “Should I be screened for breast cancer?” However,
they can rest assured that the question has more than one right answer.
Disclosure forms provided by
the authors are available with the full text of this article at NEJM.org.
We thank Lynn Ries, M.S., of the Surveillance Research Program, Division
of Cancer Control and Population Sciences, National Cancer Institute, for her
help in analyzing Surveillance, Epidemiology, and End Results data.
SOURCE INFORMATION
From the Quality Department, St. Charles Health System, Central Oregon,
and the Department of Radiation Medicine, Oregon Health and Science University,
Portland (A.B.); the University of Texas Medical School at Houston, Houston
(A.B.); and the Dartmouth Institute for Health Policy and Clinical Practice,
Geisel School of Medicine at Dartmouth, Hanover, NH (H.G.W.).
Address reprint requests to Dr. Bleyer at 2500 NE Neff Rd., Bend, OR
97701, or at ableyer@gmail.com.