Breast cancer detection accuracy of AI in an entire screening population: a retrospective, multicentre study

Cancer Imaging

Table 2 Detection accuracy analysis in both study scenarios

	Sensitivity (95% CI); p value*	Specificity (95% CI); p value*	PPV (95% CI); p value†	NPV (95% CI); p value†	Recall rate (95% CI); p value†	Arbitration rate (95% CI); p value*
Standalone AI scenario
First reader	63.7 (61.6–65.8); ref.	97.8 (97.7–97.8); ref.	18.7 (17.8–19.6); ref.	99.7 (99.7–99.7); ref.	2.7 (2.6–2.8); ref.	NA
Standalone AI_sens	63.7 (61.6–65.8); >0.99	96.5 (96.4–96.5); <0.0001	12.6 (11.9–13.2); <0.0001	99.7 (99.7–99.7); 0.71	4.0 (3.9–4.1); <0.0001	NA
Standalone AI_spec	58.6 (56.5–60.8); <0.0001	97.8 (97.7–97.8); 0.95	17.4 (16.5–18.3); 0.01	99.7 (99.6–99.7); 0.0002	2.7 (2.6–2.7); 0.24	NA
AI-integrated screening scenario
Combined reading	73.9 (72.0–75.8); ref.	97.9 (97.9–98.0); ref.	22.0 (21.0–23.0); ref.	99.8 (99.8–99.8); ref.	2.7 (2.6–2.7); ref.	2.9 (2.8–3.0); ref.
Integrated AI_sens	76.2 (74.3–78.0); 0.0004	97.3 (97.2–97.3); <0.0001	18.1 (17.3–19.0); <0.0001	99.8 (99.8–99.8); 0.07	3.3 (3.3–3.4); <0.0001	5.1 (5.1–5.2); <0.0001
Integrated AI_spec	74.6 (72.6–76.4); 0.32	97.9 (97.8–97.9); 0.54	22.0 (21.0–23.0); 0.99	99.8 (99.8–99.8); 0.60	2.7 (2.6–2.7); 0.49	4.0 (3.9–4.1); <0.0001

Data are % (95% CI); p value. PPV = positive predictive value. NPV = negative predictive value. AI_sens = artificial intelligence score cut-off point matched at mean first reader sensitivity. AI_spec = artificial intelligence score cut-off point matched at mean first reader specificity. *p values were calculated using McNemar’s test. †p values were calculated using exact binomial test.
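As an illustration of how the measures in Table 2 relate to a 2×2 confusion table, the following is a minimal Python sketch, not the study's analysis code. The function names `accuracy_metrics` and `mcnemar_exact` are hypothetical, the counts in the usage lines are invented for illustration and are not study data, and recall rate is assumed here to be the share of all screened examinations flagged positive. The study does not state which variant of McNemar's test was used; the two-sided exact (binomial) form on discordant pairs is assumed.

```python
from math import comb


def accuracy_metrics(tp, fp, fn, tn):
    """Return detection accuracy measures in percent from 2x2 counts."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
        # Assumption: recall rate = all positive calls / all examinations.
        "recall_rate": 100 * (tp + fp) / total,
    }


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p value.

    b and c are the discordant pair counts: cases where only reader A
    was correct (b) and cases where only reader B was correct (c).
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Illustrative counts only (not taken from the study).
metrics = accuracy_metrics(tp=80, fp=20, fn=20, tn=880)
p_value = mcnemar_exact(b=1, c=9)
```

Because sensitivity and specificity are compared on the same examinations read by two readers, a paired test on the discordant cases is the natural choice, whereas PPV, NPV, and recall rate have denominators that differ between readers, which is consistent with the table using a separate test for those columns.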

ISSN: 1470-7330