We constructed a de novo discrete-event simulation (DES) model to evaluate the cost-effectiveness of integrating AI for cancer detection in the NHS Breast Screening Programme. The model replicates the UK screening pathway, capturing individual screening and treatment trajectories along with the immediate and long-term benefits of earlier cancer detection, such as cancer stage shift, improved survival, reduced recurrence, and lower treatment costs. The evaluation followed National Institute for Health and Care Excellence (NICE) guidance (2025) [18] and the CHEERS-AI checklist (2024) [19], was conducted from the NHS and personal social services perspective, and considered lifetime costs and outcomes. The model was built in R (version 4.2.2). The model code is available upon request to the corresponding author and may be accessed only for academic and reproducibility purposes with approval from the study’s funder. All model assumptions, technical methods, and data sources are detailed in the appendix.

Four screening strategies were compared, based on the configurations evaluated in the ScreenTrustCAD trial: standard double reading by two radiologists; double reading by one radiologist plus AI; single reading by AI alone; and triple reading by two radiologists plus AI. The AI system used was Insight MMG version 1.1.6 (Lunit, South Korea), with technical details available in the original trial publication [17]. The DES design was chosen for its ability to represent individual histories and the timing of key events, capturing the complexity of population-based breast cancer screening pathways.

Model structure and overview

Figure 1 shows the model structure, detailing the sequence of clinical events and potential pathways from the initiation of screening to cancer detection and eventual mortality. The simulated cohort represents women eligible for routine screening. At the model’s start, each individual is assigned an age at death from causes other than breast cancer, based on recent national life tables for adult women in England [20]. They are also assigned a cancer status based on the observed 9.05% probability of developing breast cancer between ages 50 and 74 in England [21]. Women are then invited for breast screening at age 50. Those who attend proceed to mammography, which is evaluated using one of the four screening strategies. Cancer detection under each strategy is determined by its respective sensitivity and specificity. The model then stratifies women into different pathways, in line with UK guidelines [6, 22]. Women without a cancer diagnosis are invited for routine screening every three years. Women diagnosed with breast cancer receive the same mammographic imaging modality as in routine screening, but enter a surveillance pathway in which they undergo annual mammography for ten years before returning to three-yearly screening. Screening in both the routine and surveillance pathways ceases once women reach the upper screening age of 71.
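To make the initialisation and scheduling logic concrete, the following sketch illustrates the cohort-entry step described above. The published model was implemented in R; this Python fragment is a hypothetical simplification, and the function names, the uniform draw standing in for the life-table sample, and the structure of the record are placeholders rather than the model's actual code. The 9.05% probability and the ages 50, 71, and three-year interval are taken from the text.

```python
import random

SCREEN_START, SCREEN_END, SCREEN_INTERVAL = 50, 71, 3
P_CANCER_50_74 = 0.0905  # probability of developing breast cancer, ages 50-74 [21]

def init_woman(rng):
    """Assign baseline attributes at model entry (age 50)."""
    return {
        # Placeholder draw; the model samples from English life tables [20]
        "age_death_other_causes": rng.uniform(50, 100),
        "develops_cancer": rng.random() < P_CANCER_50_74,
        "next_screen_age": SCREEN_START,
    }

def schedule_screens(woman):
    """Routine three-yearly invitations until the upper screening age."""
    ages, age = [], woman["next_screen_age"]
    while age <= SCREEN_END:
        ages.append(age)
        age += SCREEN_INTERVAL
    return ages

rng = random.Random(1)
woman = init_woman(rng)
print(schedule_screens(woman))  # [50, 53, 56, 59, 62, 65, 68, 71]
```

In the full model these invitation events are interleaved with attendance draws, surveillance rescheduling after a diagnosis, and competing mortality, but the three-yearly routine schedule above is the backbone.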

Fig. 1: Model structure and screening pathways.

The figure shows the structure of the breast screening model, including entry age, screening strategies, screening outcomes, cancer detection, treatment/recovery, cancer death, and all-cause death.

The pathway above describes cancers detected at screening appointments. To represent cancers that present between screens or in women who do not attend, the model incorporates a natural history sub-model, illustrated in Fig. 2. For women who develop cancer, this includes assigning an age of symptomatic cancer detection. This is drawn from the empirical distribution of the age of screen-detected cancers in the English programme [23], with adjustment to reflect the average delay of 0.8 years between screen-detected and clinically detected presentation observed in recent NHS data [24]. After assigning the symptomatic detection age, the natural history model estimates the underlying tumour onset age by sampling the preclinical detectable phase from distributions reported in contemporary breast cancer natural history modelling and subtracting this duration from the symptomatic age [25]. During this preclinical phase cancers can be detected by screening earlier than they would have appeared symptomatically, creating a lead-time advantage [1] for screen detection. This structure produces individual-level variation in tumour onset and symptomatic presentation, both specified independently of the screening strategy applied. After diagnosis, whether identified through screening or symptomatically, women transition into treatment pathways and the model allows for the possibility of recurrence for up to 25 years after diagnosis. Each woman is followed until death from breast cancer or other causes.
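The onset-age calculation can be sketched as follows. This is an illustrative Python fragment, not the model's R code: the exponential sojourn-time draw is a placeholder for the published natural-history distributions [25], and the 0.8-year delay is the adjustment described above.

```python
import random

def natural_history(rng, screen_detected_age):
    """Hypothetical sketch of the natural-history sub-model."""
    # Symptomatic detection age: empirical screen-detected age plus the
    # average 0.8-year delay to clinical presentation [23, 24].
    symptomatic_age = screen_detected_age + 0.8
    # Preclinical detectable (sojourn) phase; exponential with a 3-year
    # mean is a placeholder for the published distributions [25].
    sojourn = rng.expovariate(1 / 3.0)
    onset_age = symptomatic_age - sojourn
    return onset_age, symptomatic_age

def lead_time(screen_detected_age, symptomatic_age):
    """Years of earlier detection gained when screening finds the tumour
    during the preclinical phase."""
    return max(0.0, symptomatic_age - screen_detected_age)
```

Because onset and symptomatic ages are fixed before any strategy is applied, the same woman can be compared across strategies, with differences arising only from whether a screen falls inside her preclinical window and is read correctly.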

Fig. 2: Representation of cancer genesis, detection, and lead time.

The figure illustrates the relationship between cancer genesis, screen detection, symptomatic detection, tumour presence period, false positives, and lead time within the model.

Model data sources and model assumptions

The model adopts a single-cohort design, with all women entering the simulation at a common starting age of 50. Screening attendance is modelled probabilistically based on age, invitation type (first or repeat), and screening history, using data from the NHS Breast Screening Programme Audit (2022–2023) [23] and the Age Trial [26]. UK baseline accuracy data for the standard screening strategy vary by age and breast density [27]. Breast density is assessed using the Volpara Density Grade [28] (VDG), an automated volumetric metric based on mammographic x-ray attenuation that assigns breasts to four ordered strata from almost entirely fatty to extremely dense. Diagnostic accuracy for the three AI-based strategies is derived from the Swedish ScreenTrustCAD trial [17]. We applied the relative accuracy changes in the ScreenTrustCAD trial to the matched age–density strata in the UK baseline data [27].
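The mapping of trial-derived relative accuracy onto UK baseline strata can be illustrated as below. The numbers and the multiplicative form are assumptions for illustration only; the stratum keys, values, and `adjust_accuracy` name are hypothetical, not taken from the model.

```python
def adjust_accuracy(baseline, relative):
    """Apply relative accuracy changes to matched age-density strata.

    baseline: {stratum: {"sens": p, "spec": p}} from UK data [27]
    relative: {stratum: {"sens": ratio, "spec": ratio}} from ScreenTrustCAD [17]
    Results are capped at 1.0, since sensitivity and specificity are
    probabilities.
    """
    return {
        s: {
            "sens": min(1.0, baseline[s]["sens"] * relative[s]["sens"]),
            "spec": min(1.0, baseline[s]["spec"] * relative[s]["spec"]),
        }
        for s in baseline
    }

# Illustrative (made-up) values for one age-density stratum
baseline = {("50-54", "VDG1"): {"sens": 0.85, "spec": 0.96}}
relative = {("50-54", "VDG1"): {"sens": 1.02, "spec": 1.01}}
adjusted = adjust_accuracy(baseline, relative)
```

Stratifying by both age band and Volpara density grade keeps the trial's relative effects anchored to the corresponding UK baseline performance rather than to a pooled average.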

At cancer detection, tumour stage is assigned probabilistically as DCIS or invasive stage 1–4 using age- and detection-mode–specific distributions from UK audit data [29]. In this approach, the stage is determined indirectly through the natural history process. Cancers detected at routine screening follow the stage distribution observed for screen-detected cases, whereas cancers missed at screening, whether they later appear as interval cancers or present symptomatically, are assigned a more advanced stage distribution consistent with older age at diagnosis and the patterns seen in clinically detected disease. Improved mammography and reader performance reduce the number of missed cancers and shift detection to earlier ages and to screen-detected presentations. This indirect approach to stage assignment follows the methodology used in recent discrete-event simulation evaluations of changes to the UK Breast Screening Programme [30, 31].
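The probabilistic stage assignment can be sketched as a draw from a detection-mode-specific categorical distribution. The probabilities below are made-up placeholders, not the audit values [29]; only the structure (screen-detected cases skewed toward DCIS and stage 1, clinically detected cases toward later stages) reflects the text.

```python
import random

# Illustrative (made-up) stage distributions by mode of detection
STAGE_DIST = {
    "screen_detected": {"DCIS": 0.25, "1": 0.45, "2": 0.22, "3": 0.06, "4": 0.02},
    "symptomatic":     {"DCIS": 0.05, "1": 0.30, "2": 0.38, "3": 0.18, "4": 0.09},
}

def assign_stage(rng, mode):
    """Sample tumour stage from the detection-mode-specific distribution."""
    u, cum = rng.random(), 0.0
    for stage, p in STAGE_DIST[mode].items():
        cum += p
        if u < cum:
            return stage
    return "4"  # guard against floating-point rounding at the upper tail
```

Because stage depends on detection mode rather than on the reading strategy directly, a strategy that misses fewer cancers shifts more women into the screen-detected distribution, which is how the stage-shift benefit emerges.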

After stage assignment, ten-year survival is sampled from probabilities by age, stage, and mode of detection drawn from the English breast cancer registry data [2]. Recurrence is sampled from probabilities that vary by stage and time since diagnosis, based on cohort studies of women in England with breast cancer [32, 33]. Annual recurrence is tracked up to 25 years for invasive cancer [32] and 20 years for DCIS [33]. The model assumes recurrent cancers do not appear at a lower stage than the original diagnosis, consistent with published studies [34,35,36].
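The annual recurrence tracking can be sketched as a simple yearly loop. This is a hypothetical illustration; `annual_risk` stands in for the stage- and time-dependent probabilities from the English cohort studies [32, 33], and the 25- and 20-year horizons are those stated in the text.

```python
import random

def simulate_recurrence(rng, stage, annual_risk):
    """Track annual recurrence after diagnosis.

    annual_risk(stage, year) returns the probability of recurrence in that
    year; the model draws these from English cohort data [32, 33]. Recurrence
    is tracked for 25 years for invasive cancer and 20 years for DCIS.
    """
    horizon = 20 if stage == "DCIS" else 25
    for year in range(1, horizon + 1):
        if rng.random() < annual_risk(stage, year):
            return year  # year of first recurrence
    return None  # no recurrence within the tracked horizon
```

Consistent with the assumption noted above, a recurrence sampled here would be assigned a stage no lower than the original diagnosis.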

The model was run probabilistically, with probability distributions assigned to all input parameters, including diagnostic accuracy, survival, health utilities and costs (appendix, pp 51–53).

Costs and resource use

Costs were estimated in 2023 British Pounds from a UK payer perspective, discounted at 3.5% per NICE guidance [18]. Screening-related resource use was valued using NHS reference costs [37]. The mammography imaging cost (£41 per screen in standard screening) reflects the technical cost of the mammogram, including equipment, technologist time, and the single embedded read. The tariff cost does not, however, cover the cost of screening invitations (£0.73 per invite), or the full staffing cost of image interpretation, which in standard practice involves two independent reads and, where these disagree, an additional consensus read. Diagnostic follow-up procedures include ultrasound (£68), biopsy (£373–£400), and MRI (£392). These tariffs include staff time for both the procedure and its interpretation, so these costs do not vary across screening strategies. AI-specific costs include a per-screen licensing fee (£2.02, as recommended by NICE [38]), as well as costs for IT infrastructure, governance, and staff time associated with implementation and oversight [39] (£3.89 per screen, based on NHS pilot data). Screen reader staffing costs for mammography were calculated using micro-costing methods based on national average salaries for radiologists (£113,962) and radiographers (£45,600), adjusted for role, region, experience, and locum use [40]. These were further weighted for actual staff mix, workload, and arbitration requirements, resulting in a per-read reader staffing cost of £25.11. Staffing costs varied by intervention. The single reader plus AI pathway uses one reader per screen, with a second reader only where AI disagrees. In contrast, the per-read costs are doubled in the standard screening pathway because every case is double-read and a third reader is used where the readers disagree. Cancer treatment costs were stratified by cancer stage and time since diagnosis using UK patient-level costing data [41, 42]. End-of-life care costs were determined by cause and age of death [43].
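The per-screen reading-cost calculation implied by the staffing description can be worked through as below. The £25.11 per-read cost, £2.02 licence fee, and £3.89 overhead are from the text; the disagreement rates are assumed, illustrative values, and the function names are hypothetical.

```python
READ_COST = 25.11   # per-read staffing cost from the micro-costing [40]
AI_LICENCE = 2.02   # per-screen AI licence fee [38]
AI_OVERHEAD = 3.89  # IT, governance, and oversight per screen [39]

def reading_cost_standard(p_disagree):
    """Standard double reading: two reads per screen, plus a third
    (arbitration) read in the share of screens where the readers disagree."""
    return 2 * READ_COST + p_disagree * READ_COST

def reading_cost_ai_single(p_ai_disagree):
    """Single reader plus AI: one human read per screen, a second human
    read only where AI disagrees, plus per-screen AI licence and overhead."""
    return READ_COST + p_ai_disagree * READ_COST + AI_LICENCE + AI_OVERHEAD

# Illustrative: 3% arbitration rate vs 5% AI-reader disagreement (assumed)
standard = reading_cost_standard(0.03)   # 2.03 reads per screen
single_ai = reading_cost_ai_single(0.05) # 1.05 reads plus AI costs
```

Under these assumed disagreement rates the single-reader-plus-AI pathway roughly halves the human reading cost, which is the mechanism behind the staffing-cost differences described above.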

Health-related quality of life and clinical outcomes

Health outcomes are expressed as quality-adjusted life-years (QALYs) derived from the EQ-5D instrument, following NICE 2025 guidance [18], and discounted at 3.5% per year. Additionally, clinical outcomes measured include tumour stage at detection, cancer deaths, and the proportion of cancers detected through screening. To accurately capture health-related quality of life, baseline utilities for women of screening age are drawn from UK population norms [44, 45]. Utility decrements are applied for cancer stage and treatment (largest in year one), survivorship, terminal illness, and false-positive screening. Early-stage cancer decrements combine treatment-specific utility losses [46, 47] with English treatment distributions [23]. We also incorporate reductions in quality of life associated with later-stage breast cancer, longer-term survivorship, end-of-life care and a UK-specific disutility for false-positive screening [48,49,50].
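The QALY calculation with 3.5% annual discounting can be expressed compactly. This is a minimal sketch: the utility values in the example are made up, and in the model each year's utility already reflects the stage-, treatment-, and survivorship-specific decrements described above.

```python
DISCOUNT = 0.035  # annual discount rate per NICE guidance [18]

def discounted_qalys(annual_utilities):
    """Sum yearly utilities, discounting at 3.5% per year.

    annual_utilities[t] is the decrement-adjusted utility in year t,
    with t = 0 the first model year (undiscounted).
    """
    return sum(u / (1 + DISCOUNT) ** t for t, u in enumerate(annual_utilities))

# Illustrative: two years near population-norm utility, then a year with a
# cancer-related decrement (made-up values)
q = discounted_qalys([0.85, 0.85, 0.60])
```

Costs are discounted with the same factor, so earlier detection moves both treatment costs and utility decrements to years where they carry more weight, while the survival gains accrue over the remaining lifetime.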

Cost-utility analysis

We conduct a cost-utility analysis, with results expressed as net monetary benefit (NMB) to capture the value of each intervention in monetary terms while accounting for the opportunity cost of service change to the NHS [51]. In accordance with NICE 2025 guidance [18], NMBs are calculated at a willingness-to-pay threshold of £20,000 per QALY. When comparing multiple interventions, the option with the highest NMB at the threshold is considered most cost-effective. Strategies that reduce costs while improving health outcomes compared with standard screening are termed dominant [51]. Incremental cost-effectiveness ratios (ICERs), which express the additional cost per additional QALY gained, are reported for strategies that are not dominant relative to standard screening.
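The NMB and ICER definitions above amount to two one-line formulas; the sketch below restates them with the £20,000-per-QALY threshold from the text. The example numbers are illustrative only.

```python
WTP = 20_000  # willingness-to-pay threshold, GBP per QALY [18]

def nmb(qalys, cost, wtp=WTP):
    """Net monetary benefit: health gain valued at the threshold, minus cost."""
    return qalys * wtp - cost

def icer(d_cost, d_qalys):
    """Incremental cost per additional QALY versus the comparator."""
    if d_qalys <= 0:
        raise ValueError("ICER not reported when incremental QALYs are not positive")
    return d_cost / d_qalys

# Illustrative (made-up): a strategy adding 0.25 QALYs for an extra 4,000
assert icer(5_000, 0.25) == 20_000  # exactly at the threshold
assert nmb(0.25, 4_000) == 1_000    # positive incremental NMB at 20,000/QALY
```

A strategy that lowers costs while adding QALYs has both a higher NMB and no meaningful ICER versus standard screening, which is why dominant strategies are reported via NMB alone.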

External validation and sensitivity analyses

External validation of the model was carried out using benchmarks from 2022 NHS breast screening data [23] (appendix, pp 41–42). Probabilistic sensitivity analysis (PSA) used parameter distributions detailed in the model documentation (appendix, pp 49–53). To minimise stochastic error, each run simulated 100,000 individuals, with 2000 PSA iterations balancing precision and computational feasibility. The adequacy of the number of PSA runs was assessed by reviewing the variance in NMB, as recommended by NICE guidance [18], to ensure that increasing the number of iterations would not substantially affect the results (appendix, pp 49–50). Model uncertainty was also assessed with scatterplots and cost-effectiveness acceptability curves (CEACs), showing the probability that each strategy is most cost-effective across willingness-to-pay thresholds [51]. An additional CEAC is presented for a scenario excluding the most cost-effective AI strategy, as its feasibility for policymakers is uncertain, thereby allowing a clearer comparison of how the alternative strategies perform in terms of cost-effectiveness. One-way deterministic sensitivity analyses increased the per-screen cost of AI incrementally, up to double its base-case value.
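A CEAC summarises, at each willingness-to-pay threshold, the share of PSA iterations in which each strategy has the highest NMB. The sketch below shows that computation; the data structure and strategy labels are hypothetical placeholders, not the model's actual output format.

```python
def ceac(psa_results, thresholds):
    """Cost-effectiveness acceptability curves from PSA iterations.

    psa_results: list of dicts {strategy: (qalys, cost)}, one per PSA draw.
    Returns, per threshold, the share of draws in which each strategy has
    the highest net monetary benefit.
    """
    curves = {}
    for wtp in thresholds:
        wins = {}
        for draw in psa_results:
            best = max(draw, key=lambda s: draw[s][0] * wtp - draw[s][1])
            wins[best] = wins.get(best, 0) + 1
        curves[wtp] = {s: n / len(psa_results) for s, n in wins.items()}
    return curves

# Illustrative (made-up) PSA draws for two strategies
draws = [
    {"A": (1.0, 10_000), "B": (1.1, 15_000)},
    {"A": (1.0, 10_000), "B": (1.05, 16_000)},
]
curves = ceac(draws, [20_000, 100_000])
```

Plotting these shares against the threshold yields the CEAC; the scenario excluding one strategy simply drops its key from each draw before recomputing the curves.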