Changes in Reader Performance During Sequential Reading of Breast Cancer Screening Digital Breast Tomosynthesis Examinations
Academic Background
Breast cancer is one of the most common cancers among women worldwide, and early screening is crucial for improving cure rates. Traditional digital mammography (DM) is the primary tool for breast cancer screening, but it has limitations in detecting lesions in overlapping breast tissue. In recent years, digital breast tomosynthesis (DBT) has emerged as an important tool for breast cancer screening. By generating three-dimensional images of the breast, DBT can more clearly display breast tissue, reducing misdiagnosis and missed diagnoses caused by tissue overlap. Multiple clinical observational studies have shown that DBT outperforms traditional DM in breast cancer screening (1-6). As a result, many medical institutions have invested heavily in the adoption and upgrading of DBT equipment.
However, with the widespread use of DBT in screening, researchers have begun to focus on factors that influence screening performance, particularly changes in reader performance during batch reading. Previous studies have shown that batch reading of mammograms is subject to sequential effects, meaning that reader performance varies over the course of a batch. For example, Taylor-Phillips et al. observed in a large clinical trial in England that the average recall rate decreased from 6.4% to 4.6% over the first 40 examinations in a batch, with no corresponding change in cancer detection rate (8,9). Similar results were also observed in a Norwegian study (10). These studies suggest that readers may experience visual adaptation during the reading process, which affects their performance.
Although these studies provide important insights into sequential effects in batch reading, they are primarily based on full-field digital mammography (FFDM) data from European national screening programs. Screening practices in the United States differ significantly from those in Europe, particularly in terms of screening frequency and recall rates (14-16). Additionally, the introduction of DBT may further influence sequential effects in batch reading. Therefore, this study aims to evaluate changes in reader performance during batch reading by analyzing clinical DBT screening data.
Source of the Paper
This paper was co-authored by Craig K. Abbey, Andriy I. Bandos, Mohana K. Parthasarathy, Michael A. Webster, and Margarita L. Zuley. The authors are affiliated with the Department of Psychological and Brain Sciences at the University of California, Santa Barbara; the Department of Biostatistics at the University of Pittsburgh; the Department of Radiology at Magee-Womens Hospital of the University of Pittsburgh Medical Center; and the Department of Psychology at the University of Nevada, Reno. The paper was accepted on August 15, 2024, and published in Radiology, Volume 313, Issue 2.
Study Design and Methods
Study Design
This is a retrospective, cross-sectional observational study using data from the Radiology Information System (RIS) at Magee-Womens Hospital of the University of Pittsburgh Medical Center. Data were collected from DBT screening examinations performed between January 2018 and December 2019. The reference standard was the pathology result or the imaging findings at 1-year follow-up. The study excluded batches with fewer than three examinations, examinations with same-day results, and radiologists with atypical reading patterns.
Data Collection
Through RIS queries, the researchers obtained the timing of each examination, patient age, breast density, and 1-year follow-up results. Cancer status was verified using the cancer registry system (MetriQ; Elekta). The recall rate, cancer detection rate, and the time interval between screening examinations were calculated for each radiologist.
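A minimal sketch of how per-radiologist rates like these could be tallied. The record layout (the `reader`, `recalled`, and `cancer` fields) and the function name are hypothetical. The definitions used here are conventional assumptions, not definitions taken from the paper: recall rate is recalled examinations over all examinations, and cancer detection rate is recalled examinations with verified cancer per 1,000 examinations.

```python
from collections import defaultdict


def reader_metrics(exams):
    """Tally recall rate and cancer detection rate per reader.

    exams: iterable of dicts with hypothetical keys
      'reader' (str), 'recalled' (bool), 'cancer' (bool).
    """
    counts = defaultdict(lambda: {"n": 0, "recalled": 0, "detected": 0})
    for exam in exams:
        c = counts[exam["reader"]]
        c["n"] += 1
        c["recalled"] += int(exam["recalled"])
        # A cancer counts as detected only if the examination was recalled.
        c["detected"] += int(exam["recalled"] and exam["cancer"])
    return {
        reader: {
            "recall_rate": c["recalled"] / c["n"],
            "cdr_per_1000": 1000 * c["detected"] / c["n"],
        }
        for reader, c in counts.items()
    }
```

The interval between consecutive examinations would come from the examination timestamps, which this sketch does not model.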
Image Acquisition and Interpretation
All DBT examinations were reconstructed at 1-mm thickness, and synthetic two-dimensional mammograms were generated from them. Radiologists interpreted the examinations on a dedicated mammography workstation (Hologic), which automatically opened the next case. A batch was defined as a sequence of examinations with inter-examination time intervals of 10 minutes or less.
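The batch definition can be sketched as a single pass over one radiologist's examination timestamps. This is an illustration, not the authors' code. The function names are hypothetical, the timestamps are assumed to be sorted in ascending order, and the interval is assumed to be the difference between consecutive timestamps. The minimum batch size of three follows the exclusion rule described under Study Design.

```python
from collections import Counter
from datetime import datetime, timedelta


def assign_batches(timestamps, max_gap=timedelta(minutes=10)):
    """Label each timestamp with (batch_id, position_in_batch).

    A new batch starts whenever the gap to the previous examination
    exceeds max_gap; a gap of exactly max_gap stays in the same batch.
    """
    labels = []
    batch_id = 0
    position = 0
    previous = None
    for ts in timestamps:
        if previous is not None and ts - previous > max_gap:
            batch_id += 1
            position = 0
        position += 1
        labels.append((batch_id, position))
        previous = ts
    return labels


def drop_short_batches(labels, min_size=3):
    """Keep only labels that belong to batches with at least min_size exams."""
    sizes = Counter(batch_id for batch_id, _ in labels)
    return [(b, p) for b, p in labels if sizes[b] >= min_size]
```

For example, timestamps at 9:00, 9:03, 9:13, 9:30, and 9:35 form one batch of three examinations followed by one batch of two, and only the first batch survives the size filter.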
Statistical Analysis
The primary endpoints were recall rate and interpretation time across batch positions. Generalized linear mixed models (GLMM) were used for statistical analysis, accounting for patient characteristics (age, breast density), reading environment (weekday, time of day), and heterogeneity among radiologists.
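One plausible form of such a model for the recall outcome is sketched below. The logit link, the categorical coding of the covariates, and the single random intercept per radiologist are assumptions made for illustration; this summary does not state the exact specification used in the paper.

```latex
\operatorname{logit}\,\Pr(Y_{ij} = 1)
  = \beta_0
  + \beta_{p(i)}
  + \beta_{\text{age}}\,\text{age}_{i}
  + \beta_{d(i)}
  + \beta_{w(i)}
  + \beta_{t(i)}
  + u_j,
\qquad
u_j \sim \mathcal{N}(0, \sigma_u^2)
```

Here $Y_{ij}$ indicates whether examination $i$, read by radiologist $j$, was recalled; $p(i)$ is its batch position, $d(i)$ its breast density category, $w(i)$ the weekday, and $t(i)$ the time of day; $u_j$ captures heterogeneity among radiologists. A model for interpretation time would keep the same linear predictor but use a response distribution and link suited to a positive continuous outcome.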
Results
Sample Characteristics
The final sample comprised 121,652 screening examinations, of which 1,081 were cancer cases, interpreted by 15 radiologists. The median patient age was 61 years. The median number of examinations per batch was seven, and most batches (92%) contained no cancer cases.
Recall Rate and Interpretation Time Analysis
The unadjusted false-positive rate was 15.5% for the first examination in a batch and decreased to 10.5% after three sequentially read examinations (p < 0.001), with no significant change in sensitivity (82.6% vs 84.2%; p = 0.15). Interpretation time decreased gradually within the batch, with the average interpretation time for non-cancer examinations decreasing from 2.8 minutes to 2.2 minutes (p < 0.001). Adjusted analysis showed that the changes in false-positive rate and interpretation time remained significant.
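As a rough illustration of the unadjusted quantities above, the sketch below tallies false-positive rate and sensitivity by batch position and computes a relative change between two values. The tuple layout and function names are hypothetical, and the definitions (false-positive rate as recalled non-cancer examinations over all non-cancer examinations; sensitivity as recalled cancers over all cancers) are standard assumptions, not details confirmed by this summary.

```python
def rates_by_position(exams):
    """exams: iterable of (position, recalled, cancer) tuples.

    Returns {position: {'fpr': ..., 'sensitivity': ...}}; a rate is None
    when its denominator is zero at that position.
    """
    tally = {}
    for position, recalled, cancer in exams:
        t = tally.setdefault(position, {"fp": 0, "neg": 0, "tp": 0, "pos": 0})
        if cancer:
            t["pos"] += 1
            t["tp"] += int(recalled)
        else:
            t["neg"] += 1
            t["fp"] += int(recalled)
    return {
        p: {
            "fpr": t["fp"] / t["neg"] if t["neg"] else None,
            "sensitivity": t["tp"] / t["pos"] if t["pos"] else None,
        }
        for p, t in sorted(tally.items())
    }


def relative_change(before, after):
    """Fractional change from before to after (negative means a decrease)."""
    return (after - before) / before
```

Applied to the reported figures, `relative_change(15.5, 10.5)` is about -0.32 and `relative_change(2.8, 2.2)` is about -0.21, that is, roughly a one-third relative drop in false-positive rate and a one-fifth drop in interpretation time.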
Discussion
This study reveals sequential effects in batch reading of DBT screening examinations: readers’ false-positive rates and interpretation times decrease over the course of a batch. These changes are partly attributed to selection bias, because the first examination in a batch may more often be a complex or suspicious case. However, the adjusted analysis suggests that visual adaptation may also be an important mechanism behind these effects.
Visual adaptation is a property of the perceptual system that adjusts sensitivity based on the current stimulus environment. Previous studies have shown that mammographic texture can induce visual adaptation, influencing readers’ perception and performance (11-13). In this study, the decrease in false-positive rate and interpretation time is consistent with the hypothesis of visual adaptation, suggesting that readers gradually adapt to the characteristics of the images during the reading process.
Conclusion
By analyzing clinical DBT screening data, this study reveals changes in reader performance during batch reading. The decreases in false-positive rate and interpretation time are partly due to selection bias, but visual adaptation may also play an important role. These findings suggest that future screening practices could consider optimizing reading sequences to further improve DBT screening performance.
Highlights of the Study
- Significant Decrease in False-Positive Rate: The unadjusted false-positive rate decreased from 15.5% for the first examination in a batch to 10.5% after three sequentially read examinations, and the change remained significant after adjustment.
- Reduction in Interpretation Time: The interpretation time for non-cancer examinations decreased from 2.8 minutes to 2.2 minutes, indicating improved efficiency during batch reading.
- Potential Role of Visual Adaptation: The study suggests that visual adaptation may be an important mechanism explaining changes in reader performance, providing new insights for future screening practices.
Value of the Study
This study provides important evidence for understanding sequential effects in batch reading of DBT screening, revealing patterns of change in false-positive rate and interpretation time. These findings not only help optimize screening workflows but also offer new directions for future research, particularly on the role of visual adaptation in medical image interpretation. Through further research, clinical practices can better utilize these findings to improve the accuracy and efficiency of breast cancer screening.