  • Research article
  • Open access

Clinical application of machine learning models in patients with prostate cancer before prostatectomy



To build machine learning predictive models for surgical risk assessment of extracapsular extension (ECE) in patients with prostate cancer (PCa) before radical prostatectomy; and to compare the use of decision curve analysis (DCA) and receiver operating characteristic (ROC) metrics for selecting input feature combinations in models.


This retrospective observational study included two independent data sets: 139 participants from a single institution (training), and 55 from 15 other institutions (external validation), both treated with Robotic Assisted Radical Prostatectomy (RARP). Five ML models, based on different combinations of clinical, semantic (interpreted by a radiologist) and radiomics features computed from T2W-MRI images, were built to predict extracapsular extension in the prostatectomy specimen (pECE+). DCA plots were used to rank the models’ net benefit when assigning patients to prostatectomy with non-nerve-sparing surgery (NNSS) or nerve-sparing surgery (NSS), depending on the predicted ECE status. DCA model rankings were compared with those derived from the ROC area under the curve (AUC).


In the training data, the model using clinical, semantic, and radiomics features gave the highest net benefit values across the relevant threshold probabilities, and a similar decision curve was observed in the external validation data. The model ranking by AUC differed in the discovery group, favouring the model using clinical + semantic features only.


The combined model based on clinical, semantic and radiomic features may be used to predict pECE+ in patients with PCa and results in a positive net benefit when used to choose between prostatectomy with NSS or NNSS.


Prostate cancer (PCa) is the second most commonly diagnosed malignancy in men and is the second leading cause of mortality from cancer [1]. Radical prostatectomy is a well-established treatment for managing localized PCa, and the goal is to achieve a negative surgical margin while preserving urinary continence and erectile function. As such, accurate preoperative staging is of great importance for guiding treatment [2].

Multiparametric magnetic resonance imaging (mpMRI) is the recommended imaging method for tumour detection and for differentiating advanced cancers with extracapsular extension (ECE) from localized disease [3, 4]. The use of mpMRI combined with traditional clinicopathological risk nomograms is recommended before prostatectomy to determine the need for nerve-sparing surgery (NSS) and pelvic lymphadenectomy [2, 5, 6].

Park et al. [10] recently compared four MRI-based ECE assessment schemes reported in the literature, namely the grading system of Mehralivand et al. [7], the European Society of Urogenital Radiology (ESUR) score [8], a subjective Likert scale [9], and measurement of tumour capsular contact length (TCCL), in a group of 301 patients (43% with pathologic ECE). The study showed sensitivities between 68 and 82% for extraprostatic extension detection [10]. These MRI scoring schemes demonstrated fair diagnostic performance, substantial agreement and association with histopathologic tumour extension [11]; however, considerable inter-observer variability remains a significant challenge in using these mpMRI-based scores [10, 11].

Machine learning (ML) applications in patients with prostate cancer remain an active research area, focusing primarily on automatic segmentation, detection and localization, and assessment of disease aggressiveness using mpMRI [12, 13]. At present, only a few studies have introduced ML models to predict pECE+ (the presence of ECE in pathology specimens) in PCa staging by mpMRI. Most combined radiomic features extracted from T2-weighted MR images with semantic and clinical features to predict ECE [14,15,16,17,18,19,20]. To the best of our knowledge, no clinically accepted and validated algorithm that predicts pECE+ from MRI features has yet been developed for use in preoperative PCa surgical decision-making. The metric usually used to guide model selection is the area under the curve (AUC), which does not account for the specific use case. Decision curve analysis (DCA) [21] is a method for assessing the clinical utility of ML predictive models because it estimates the net benefit achieved when a predictive model is used in a specified scenario.

The objective of this study was to develop ML predictive models for use in PCa surgical decision-making for a specific clinical use-case: to choose between the use of nerve-sparing surgery (NSS) and non-nerve sparing surgery (NNSS) when performing prostatectomy in patients with PCa (Fig. 1).

Fig. 1

Schema of prostatectomy types

Peri-prostatic nerves (arrow) near the prostate capsule are not dissected (blue line) and only the prostate and capsule are removed in NSS (nerve-sparing surgery) prostatectomy. In NNSS (non-nerve-sparing surgery), the peri-prostatic nerves and some extracapsular tissue are removed (dashed line)


Participants characteristics

Two independent data sets were available: a) discovery: single institution, Hospital da Luz (HdL), N = 139, used for training (imaging followed the institutional MRI protocol, Supplementary Table S1); b) validation: multi-centre (15 external institutions), N = 55, used for testing. This cohort was part of a previously published predictive model [22] without radiomics analysis (Fig. 2).

All participants included in this study (discovery and validation groups) underwent Robotic Assisted Radical Prostatectomy (RARP), with pathologically confirmed PCa on prostate biopsy and an index lesion of PIRADS > 2 (PI-RADS v2) on MRI. A uropathologist (JC) with 10 years’ experience analysed all surgically resected prostate gland specimens using the same protocol, including determination of ECE status. Only cases with matched, correlated pathology (JC) and radiology (AG) results were included.

Fig. 2

Flowchart of study cohort selection

Data upload and curation

MRI DICOM images were pseudonymized and transferred to a research PACS based on the extensible Neuroimaging Archive Toolkit (XNAT) platform [23], which served as the principal repository for image curation and analysis.


One radiologist (AG, ten years’ experience) manually segmented the region of interest (index lesion) for all cases in both data sets. A second radiologist (MK, three years’ experience) independently segmented a randomized selection of 30 cases from the discovery dataset (stratified on lesion size) to enable radiomic feature reproducibility to be determined.

Radiomics extraction

T2W MR images were interpolated to standard voxel size (0.5 × 0.5 × 3 mm) and z-score image normalization was applied. The normalized image intensities were quantized to 64 bins using the built-in uniform quantization method in Pyradiomics [24] and 107 features were calculated.
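The intensity preprocessing described above (z-score normalization followed by fixed-bin-count quantization) can be sketched in NumPy. This is an illustrative reimplementation on an invented toy volume, not the Pyradiomics internals, and the function name is ours:

```python
import numpy as np

def preprocess_intensities(img, bin_count=64):
    """Z-score normalize an image array, then quantize the normalized
    intensities to a fixed number of uniform-width bins."""
    z = (img - img.mean()) / img.std()                # z-score normalization
    edges = np.linspace(z.min(), z.max(), bin_count + 1)
    # digitize returns bin indices 1..bin_count+1; clip the max-value voxels
    return np.clip(np.digitize(z, edges), 1, bin_count)

rng = np.random.default_rng(0)
img = rng.normal(300.0, 40.0, size=(8, 8, 4))         # toy T2W intensity volume
q = preprocess_intensities(img)
print(q.min(), q.max())                               # bin indices span 1..64
```

In practice both steps are handled by Pyradiomics settings rather than hand-written code; the sketch only makes the arithmetic explicit.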

Semantic and clinical features

The first radiologist (AG) classified the index lesion for both data sets using eight semantic features (Fig. 3). The second radiologist (MK) independently classified the semantic features for the whole discovery set to enable feature reproducibility to be determined. Both radiologists also classified each lesion as having measurable ECE (mECE), i.e., the presence of a clear periprostatic extension.

Three clinical features were obtained from the electronic patient record (EPR): Gleason score, prostate volume and PSA. AG classified the index lesion in accordance with PIRADS-v2 [25]. Due to the small sample size, Gleason score was grouped into two classes based on tumour aggressiveness: low = with Gleason score of 6(3 + 3) or 7(3 + 4); high = including cases with a Gleason score of 7(4 + 3) or above. The PIRADS score was treated as a categorical variable to enable the predictive model to fit a non-proportional effect to this feature.
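The Gleason dichotomization above can be expressed as a small helper; the function name and integer-pattern interface are our own illustrative choices:

```python
def gleason_group(primary: int, secondary: int) -> str:
    """Dichotomize Gleason score by tumour aggressiveness:
    6 (3+3) and 7 (3+4) are 'low'; 7 (4+3) and above are 'high'."""
    total = primary + secondary
    if total <= 6 or (total == 7 and primary == 3):
        return "low"
    return "high"

print(gleason_group(3, 4))  # low
print(gleason_group(4, 3))  # high
```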

Fig. 3

MRI Semantic features for detection of ECE+

This figure illustrates the eight radiologist-interpreted semantic features used in the semantic model to predict pECE+, on axial T2WI. Measurable ECE was not used in the semantic model; it is considered alone in a separate model, as explained in the text

Feature reproducibility

Inter-observer variability was assessed using the intraclass correlation coefficient (ICC) [26] for radiomics and continuous semantic features (lesion size and tumour capsular contact length (TCCL)), and Cohen’s kappa for the binary semantic features. Radiomic features with ICC > 0.75 were used for model building and the remaining features were discarded. All semantic features were used for model building, and their reproducibility estimates were used to identify the features most likely to adversely impact the stability of the ECE status predictions, and therefore those that would benefit most from further standardization efforts.
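For the binary semantic features, Cohen's kappa can be computed directly from the two readers' labels. A minimal from-scratch sketch with invented example labels:

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two raters' binary (0/1) labels:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n      # observed agreement
    pa1, pb1 = sum(a) / n, sum(b) / n               # each rater's P(label=1)
    pe = pa1 * pb1 + (1 - pa1) * (1 - pb1)          # agreement expected by chance
    return (po - pe) / (1 - pe)

reader1 = [1, 1, 0, 0, 1, 0, 1, 0]                  # invented example labels
reader2 = [1, 1, 0, 0, 1, 0, 0, 1]
print(round(cohens_kappa(reader1, reader2), 3))     # 0.5
```

In the example, 6 of 8 labels agree (po = 0.75) against a chance agreement of 0.5, giving kappa = 0.5.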

Model discovery and validation

Models were built from the discovery data using the following four combinations of the three feature sets: i) clinical; ii) CS: clinical + semantic; iii) CR: clinical + radiomics; iv) CSR: clinical + semantic + radiomics. All combinations include the clinical features because they are routinely obtained for all participants as part of their standard of care.

For the two models that include radiomics features, a hierarchical feature reduction scheme [27] was used to remove correlated features with Spearman’s correlation > 0.9.
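A simplified greedy version of this correlation-based pruning (not the exact hierarchical scheme of [27]) can be sketched in pure Python; the feature names and values are invented:

```python
def ranks(x):
    """1-based ranks of the values (assumes no ties, for simplicity)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0.0] * len(x)
    for rank, i in enumerate(order, start=1):
        r[i] = float(rank)
    return r

def spearman(x, y):
    """Spearman's rho = Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

def prune_correlated(features, threshold=0.9):
    """Keep a feature only if |rho| with every already-kept feature
    stays at or below the threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(spearman(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

feats = {
    "MeshVolume":  [10, 40, 25, 60, 35],
    "VoxelVolume": [11, 42, 26, 61, 34],   # near-duplicate of MeshVolume
    "Minimum":     [50, 20, 70, 30, 10],
}
print(prune_correlated(feats))             # VoxelVolume is removed
```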

Models were built using logistic regression (LR), and LASSO regularization was used for feature selection in the three models that included semantic or radiomic features. The LASSO regularization parameter was tuned using 10-fold cross-validation (CV) over a log-spaced grid (20 values, 10^-4 to 10^4), and each input feature was z-score normalized. A fifth model (univariate LR) was built using the mECE feature, which enabled baseline ROC and DCA curves to be constructed.
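This tuning scheme maps naturally onto scikit-learn. The sketch below uses synthetic data with invented dimensions and may differ in detail from the paper's pipeline; it shows z-score scaling plus an L1-penalized logistic regression tuned over a 20-value log-spaced C grid by 10-fold CV:

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(139, 12))          # 139 patients, 12 candidate features (invented)
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=139) > 0).astype(int)

# L1 (LASSO) penalty performs feature selection by driving coefficients to zero.
model = make_pipeline(
    StandardScaler(),                    # z-score each input feature
    LogisticRegressionCV(
        penalty="l1", solver="liblinear",
        Cs=np.logspace(-4, 4, 20), cv=10, scoring="roc_auc",
    ),
)
model.fit(X, y)
coefs = model[-1].coef_.ravel()
print("features kept:", int(np.sum(coefs != 0)))
```

Note that scikit-learn's `C` is the inverse of the regularization strength, so the grid here spans the same eight decades on the inverse scale.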

Performance metrics for the discovery data set were estimated using a 10-fold CV repeated 100x, such that the parameter tuning CV was nested inside the performance estimation CV. Performance indicators included accuracy, F1-score, AUC, the ROC curve, and the DCA net-benefit curve [21], and these were computed for each of the outer CV splits and averaged to generate the final values and plots. The DCA net-benefit curves were used to select the final model that was tested in the validation data. An interpretation of this model was obtained using SHAP [28] analysis (SHapley Additive exPlanations), which explains the model predictions by computing the contribution of each feature to the overall risk prediction for each patient. The DCA and ROC curves were calculated for the validation data using the final model. The model development pipeline is shown in Fig. 4.
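The DCA net-benefit value [21] is itself simple to compute from predicted risks and observed outcomes: NB = TP/n - FP/n * pt/(1 - pt) at threshold probability pt. A minimal sketch with invented predictions, compared against the "treat all with NNSS" baseline:

```python
def net_benefit(y_true, y_prob, pt):
    """Net benefit of treating (here: assigning NNSS) all patients whose
    predicted risk is >= pt, per Vickers et al."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 0)
    return tp / n - fp / n * pt / (1 - pt)

y_true = [1, 0, 1, 0, 0, 1, 0, 0]                    # pECE+ status (invented)
y_prob = [0.8, 0.2, 0.6, 0.4, 0.1, 0.7, 0.3, 0.5]    # model risks (invented)
for pt in (0.1, 0.2, 0.3):
    nb_model = net_benefit(y_true, y_prob, pt)
    nb_all = net_benefit(y_true, [1.0] * len(y_true), pt)  # "treat all" line
    print(f"pt={pt:.1f}  model NB={nb_model:.3f}  treat-all NB={nb_all:.3f}")
```

Sweeping pt over a range of thresholds and plotting the net benefit yields the decision curve; the "treat none" (all-NSS) line is identically zero.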

Fig. 4

Model development pipeline


Participants characteristics

Table 1 summarizes the clinical and semantic feature distributions of both data sets. There were no statistically significant differences between the discovery and validation data sets (p > 0.05), except for smooth capsular bulging (p = 0.03); however, this feature was not selected in any of the models evaluated in the validation data. A majority of participants did not have ECE detected in their surgical specimens (74.1% and 65.5% in the discovery and validation groups, respectively), and we conclude that the populations and MRI examinations in the two data sets are comparable.

Table 1 Data distributions of the clinical and semantic features in the discovery and validation data sets. Binary semantic features show counts for absent/present (values in parentheses are percentages), and mean +/- SD is given for continuous features. P-values comparing the discovery and validation distributions were computed using Fisher’s exact test for binary features (and PIRADS) and unpaired t-tests for continuous features

Model performance comparisons

Model performance metrics (AUC, accuracy and F1 score) are given in Table 2, and the ROC and DCA curves are shown in Fig. 5 for the discovery and validation data. As previously mentioned, model selection was based on the DCA curves in the discovery data (Fig. 5b). Up to a threshold of 0.3, the net benefit of the CSR model (red line) is higher than that of the three other multivariate models and the univariate model derived from mECE. This selection of the CSR model as the final model differs from what would be obtained using the performance metrics in Table 2, where the clinical + semantic model has higher values for all performance metrics than the other three multivariate models. The baseline univariate model derived from mECE had higher accuracy and F1 score, but at the expense of a lower AUC.

Fig. 5

ROC and DCA plots for the four multivariate predictive models for ECE+ (blue, orange, green, red lines) and the univariate model derived from mECE (purple line) in participants with PCa. Panels (a) ROC and (b) DCA are for the discovery data set; panels (c) and (d) are the corresponding plots for the validation data set. The DCA plots also include lines for the net benefit when all participants receive non-nerve-sparing surgery (NNSS) and when no participants receive NNSS (i.e. when all participants receive nerve-sparing surgery, NSS). The net benefit is equal to or higher than both lines for all models. The x-axis of the DCA plots is the threshold of the risk predicted by the model at which NNSS would be indicated. A vital aspect of the DCA concept is that this threshold is directly related to the ratio of the costs associated with false negative and false positive predictions: low values of the threshold correspond to the use case where failing to give NNSS (with curative intent) is more costly than the complications that may arise from using NNSS

Table 2 shows that the accuracy and F1 scores of the CSR model are somewhat lower in the validation data. In contrast, the AUC is in fact higher for the validation data. Whilst this elevation is unusual, it is reasonable since the performance metrics are derived from two patient samples and are therefore influenced by random fluctuations related to patient variability. Although the validation AUC for the CSR model (0.928) is higher than the average AUC in the discovery data (0.880), it is smaller than 37% of the values from the 1000 cross-validation splits used to obtain the discovery mean AUC estimate.

Table 2 Performance metrics for the five predictive models in the discovery and validation data sets. Error limits are +/- 1 standard deviation across 1000 CV splits

Model explanation via SHAP analysis

Figure 6 shows the SHAP beeswarm plot for the CSR model, where the most influential features (based on the average SHAP value across all participants) are at the top of the plot. For the top five features in this plot, high positive SHAP values are associated with high feature values, which indicates an increased risk of pECE+ for participants with high Gleason scores, longer TCCL and positive findings for irregular contour, rectoprostatic angle obliteration and capsular disruption. TCCL was the most reproducible semantic feature (Supplementary Table S2). Prostate volume appeared to have a protective effect (larger values are associated with lower ECE risk), and the clinical features PSA and PI-RADS score were not present in the model. Three radiomics features appeared in the model: the two first-order features (10Percentile and Minimum) indicated increased pECE+ risk for lower values, whereas the shape feature (MeshVolume, i.e. the lesion volume) indicated greater pECE+ risk for larger lesion volumes. All three radiomics features were highly reproducible (Supplementary Table S3). None of the second-order (texture) radiomics features were present in the model.

Fig. 6

Beeswarm plot of SHAP values for the final model developed using clinical + semantic + radiomic features, representing the influence of each feature when predicting pECE+. Blue dots indicate low feature values and red dots high values; positive SHAP values indicate an increased risk of pECE+, and negative SHAP values a decreased risk


We built five machine learning predictive models to detect pECE+ and compared their utility for selecting NNSS when ECE is predicted, as NNSS then has a better chance of controlling the disease than NSS. The five models were built using the clinical tools and semantic features previously described by the first author [22], with the addition of new radiomics features derived from MRI images: i) clinical; ii) CS: clinical + semantic; iii) CR: clinical + radiomics; iv) CSR: clinical + semantic + radiomics (following an appropriate pipeline criterion and with inter-reader agreement); and lastly v) a univariate measurable-ECE model. The CS model achieved the best AUC in the discovery set, with the CSR model almost as good (Table 2). The CSR model maintained good performance in the validation data and has the advantage that the radiomics features it includes were reproducible, with ICC agreement > 0.75 between readers. Of all the features in the CS model, TCCL achieved the best inter-reader reproducibility (ICC 0.683). Our results align with previous reports in the literature showing that combining radiomics, clinical and semantic models to predict pECE+ is more accurate than using individual models [14, 15, 29,30,31]. This paper follows previously published work by the lead author [22], in which clinical + semantic features were used to develop a predictive model based on a classic logistic regression algorithm to predict pECE+ with good performance (AUC 90%). With the ML methodology used here, the main clinical and semantic predictive features were GS > (3 + 4) and TCCL, similar to previously published results [22, 29, 30]. Furthermore, the addition of a radiomics signature improved reproducibility, reducing the subjective nature of the previous model, which relied on conventional visual MRI interpretation by radiologists.

Predictive signatures to detect pECE+ have been published, but these have not been considered in the context of surgical decision-making [29, 32]. In this study, we have gone further, examining how our model could perform in real life and quantifying the potential impact of using it to choose between NSS and NNSS. Most surgeons advocate NSS for patients with pECE- to achieve lower morbidity from nerve damage, such as incontinence and erectile dysfunction, while maintaining a high rate of negative surgical margins (NSM); patients with pECE+ would benefit from NNSS to achieve NSM, despite the increased risk of morbidity from nerve damage and other surgical side-effects. The DCA method was used to compare the net benefit of all five predictive models for detecting pECE+, also comparing against the “treat all” case (i.e., treating all patients with NNSS) and the “treat none” case (i.e., treating no patients with NNSS, meaning all patients receive NSS as the default treatment), see Fig. 5. The threshold probability (x-axis) in this plot encapsulates consideration of the potential surgical side-effects of NNSS versus the possibility of positive surgical margins and disease recurrence, which ultimately depends on surgeon and patient preference. The net benefit value quantifies the consequences of false positives (FP) and false negatives (FN) in relation to benefit and harm.

In the DCA analysis, the risk of side-effects is increased as a consequence of using the model compared with always using the NSS strategy, but the success rate of the surgery is not affected, i.e. NNSS and NSS would both have similar chances of successful treatment in a patient who does not have ECE. The CSR model was considered the best model because it achieved the best (or equal) net benefit values for threshold probabilities less than 0.3 on the DCA plot. The assumptions behind the DCA methodology [21] imply that probability thresholds less than 0.3 are equivalent to the assertion that the cost of not using NNSS when ECE is present (i.e. risking failure to achieve curative surgery) is at least 2 1/3 times the cost of causing side-effects by the use of NNSS (2 1/3 = (1 - 0.3)/0.3). In real-world cases it is likely that this cost ratio would be judged to be larger than 2 1/3 (i.e. the appropriate probability threshold would be < 0.3), and Fig. 5 shows that the CSR model has superior performance over this range.

The mECE variable represents the radiologist’s assessment of macroscopically visible extra-prostatic disease on the MR images; by using this (binary) variable as the input to a logistic regression, a model can be built that allows the ROC and DCA performance of mECE to be compared directly with the other models. The multivariate models that include semantic and/or radiomics features outperformed the univariate mECE model in terms of AUC (Table 2) and net benefit (for thresholds below 0.3, Fig. 5). In the case of the CS model, this suggests that guiding the radiological assessment by breaking the examination down into more specific factors (i.e. the semantic features) leverages the radiologist’s knowledge more effectively than cognitively summarizing these factors into an overall judgement on the presence of pECE.

Our study has some limitations: the sample size is small, and external validation was performed with MRI examinations from other institutions, although these were interpreted by the same radiologist and the patients were operated on by the same surgeon. The predictive model is of clinical value to our institution and serves as a pilot project; further work will include applying the predictive model at other institutions as the next step.


The combined clinical + semantic + radiomics model can be used to predict pECE+ in patients with PCa and results in a positive net benefit when choosing between prostatectomy with NSS or NNSS.

Data availability

The datasets and models used and/or analysed during the current study are available from the corresponding author on reasonable request.



Abbreviations

AUC: Area under the curve

DCA: Decision curve analysis

ECE: Extracapsular extension

NNSS: Non-nerve-sparing surgery

NSS: Nerve-sparing surgery

pECE+: Extracapsular extension in the prostatectomy specimen


  1. Torre LA, Bray F, Siegel RL, Ferlay J, Lortet-Tieulent J, Jemal A. Global cancer statistics, 2012. CA Cancer J Clin. 2015;65(2):87–108.

  2. Talab SS, Preston MA, Elmi A, Tabatabaei S. Prostate cancer imaging. Radiol Clin North Am. 2012;50(6):1015–41.

  3. Johnson LM, Turkbey B, Figg WD, Choyke PL. Multiparametric MRI in prostate cancer management. Nat Rev Clin Oncol. 2014;11(6):346–53.

  4. Barentsz JO, Weinreb JC, Verma S, Thoeny HC, Tempany CM, Shtern F, et al. Synopsis of the PI-RADS v2 guidelines for multiparametric prostate magnetic resonance imaging and recommendations for use. Eur Urol. 2016;69:41–9.

  5. Milonas D, Venclovas Z, Muilwijk T, Jievaltas M, Joniau S. External validation of Memorial Sloan Kettering Cancer Center nomogram and prediction of optimal candidate for lymph node dissection in clinically localized prostate cancer. Cent Eur J Urol. 2020;73(1):19–25.

  6. Gandaglia G, Ploussard G, Valerio M, Mattei A, Fiori C, Fossati N, et al. A novel nomogram to identify candidates for extended pelvic lymph node dissection among patients with clinically localized prostate cancer diagnosed with magnetic resonance imaging-targeted and systematic biopsies. Eur Urol. 2019;75(3):506–14.

  7. Mehralivand S, Shih JH, Harmon S, Smith C, Bloom J, Czarniecki M, et al. A grading system for the assessment of risk of extraprostatic extension of prostate cancer at multiparametric MRI. Radiology. 2019;290(3):709–19.

  8. Shieh AC, Guler E, Ojili V, Paspulati RM, Elliott R, Ramaiya NH, et al. Extraprostatic extension in prostate cancer: primer for radiologists. Abdom Radiol. 2020;45(12):4040–51.

  9. Costa DN, Passoni NM, Leyendecker JR, de Leon AD, Lotan Y, Roehrborn CG, et al. Diagnostic utility of a likert scale versus qualitative descriptors and length of capsular contact for determining extraprostatic tumor extension at multiparametric prostate MRI. AJR Am J Roentgenol. 2018;210(5):1066–72.

  10. Park KJ, Kim MH, Kim JK. Extraprostatic tumor extension: comparison of preoperative multiparametric MRI criteria and histopathologic correlation after radical prostatectomy. Radiology. 2020;296(1):87–95.

  11. Ahmed HU, El-Shater Bosaily A, Brown LC, Gabe R, Kaplan R, Parmar MK, et al. Diagnostic accuracy of multi-parametric MRI and TRUS biopsy in prostate cancer (PROMIS): a paired validating confirmatory study. Lancet. 2017;389(10071):815–22.

  12. Choyke PL. Quantitative MRI or machine learning for prostate MRI: which should you use? Radiology. 2018;289:138–9.

  13. Goldenberg SL, Nir G, Salcudean SE. A new era: artificial intelligence and machine learning in prostate cancer. Nat Rev Urol. 2019;16(7):391–403.

  14. Li L, Shiradkar R, Leo P, Algohary A, Fu P, Tirumani SH, et al. A novel imaging based nomogram for predicting post-surgical biochemical recurrence and adverse pathology of prostate cancer from pre-operative bi-parametric MRI. EBioMedicine. 2021;63:103163.

  15. Losnegård A, Reisæter LAR, Halvorsen OJ, Jurek J, Assmus J, Arnes JB, et al. Magnetic resonance radiomics for prediction of extraprostatic extension in non-favorable intermediate- and high-risk prostate cancer patients. Acta Radiol. 2020;61(11):1570–9.

  16. Xu L, Zhang G, Zhao L, Mao L, Li X, Yan W, et al. Radiomics based on multiparametric magnetic resonance imaging to predict extraprostatic extension of prostate cancer. Front Oncol. 2020;10:1–9.

  17. Ma S, Xie H, Wang H, Yang J, Han C, Wang X, et al. Preoperative prediction of extracapsular extension: radiomics signature based on magnetic resonance imaging to stage prostate cancer. Mol Imaging Biol. 2020;22(3):711–21.

  18. Stanzione A, Cuocolo R, Cocozza S, Romeo V, Persico F, Fusco F, et al. Detection of extraprostatic extension of cancer on biparametric MRI combining texture analysis and machine learning: preliminary results. Acad Radiol. 2019;26(10):1338–44.

  19. Wang J, Wu CJ, Bao ML, Zhang J, Shi HB, Zhang YD. Using support vector machine analysis to assess PartinMR: a new prediction model for organ-confined prostate cancer. J Magn Reson Imaging. 2018;48(2):499–506.

  20. Min X, Li M, Dong D, Feng Z, Zhang P, Ke Z, et al. Multi-parametric MRI-based radiomics signature for discriminating between clinically significant and insignificant prostate cancer: cross-validation of a machine learning method. Eur J Radiol. 2019;115:16–21.

  21. Vickers AJ, Cronin AM, Elkin EB, Gonen M. Extensions to decision curve analysis, a novel method for evaluating diagnostic tests, prediction models and molecular markers. BMC Med Inform Decis Mak. 2008;8(1):53.

  22. Guerra A, Alves FC, Maes K, Joniau S, Cassis J, Maio R, et al. Early biomarkers of extracapsular extension of prostate cancer using MRI-derived semantic features. Cancer Imaging. 2022;22(1):74.

  23. Marcus DS, Olsen TR, Ramaratnam M, Buckner RL. The extensible neuroimaging archive toolkit. Neuroinformatics. 2007;5(1):11–33.

  24. van Griethuysen JJM, Fedorov A, Parmar C, Hosny A, Aucoin N, Narayan V, et al. Computational radiomics system to decode the radiographic phenotype. Cancer Res. 2017;77(21):e104–7.

  25. Weinreb JC, Barentsz JO, Choyke PL, Cornud F, Haider MA, Macura KJ, et al. PI-RADS prostate imaging-reporting and data system: 2015, version 2. Eur Urol. 2016;69(1):16–40.

  26. Koo TK, Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. 2016;15(2):155–63.

  27. Doran SJ, Kumar S, Orton M, D’Arcy J, Kwaks F, O’Flynn E, et al. Real-world radiomics from multi-vendor MRI: an original retrospective study on the prediction of nodal status and disease survival in breast cancer, as an exemplar to promote discussion of the wider issues. Cancer Imaging. 2021;21(1):37.

  28. Lundberg S, Lee SI. A unified approach to interpreting model predictions. In: 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA; 2017. p. 1–10.

  29. Bai H, Xia W, Ji X, He D, Zhao X, Bao J, et al. Multiparametric magnetic resonance imaging-based peritumoral radiomics for preoperative prediction of the presence of extracapsular extension with prostate cancer. J Magn Reson Imaging. 2021;54(4):1222–30.

  30. He D, Wang X, Fu C, Wei X, Bao J, Ji X, et al. MRI-based radiomics models to assess prostate cancer, extracapsular extension and positive surgical margins. Cancer Imaging. 2021;21(1):46.

  31. Fan X, Xie N, Chen J, Li T, Cao R, Yu H, et al. Multiparametric MRI and machine learning-based radiomic models for preoperative prediction of multiple biological characteristics in prostate cancer. Front Oncol. 2022;12:1–12.

  32. Cuocolo R, Stanzione A, Faletti R, Gatti M, Calleris G, Fornari A, et al. MRI index lesion radiomics and machine learning for detection of extraprostatic extension of disease: a multicenter study. Eur Radiol. 2021;31(10):7575–83.



Acknowledgements

Thank you to my residents, and to my colleagues Jaime Calha, Tiago Castela, Prof. Rui Maio and Prof. Caseiro Alves, who gave full support with the radiology department's tasks.


Funding

This work was supported by a PhD student scholarship (Adalgisa Guerra) and was granted as a scientific project by Hospital da Luz (ID LH.INV.F2019027).

Author information

Authors and Affiliations



Contributions

AG: study concepts/design, data collection, data analysis/interpretation, and manuscript preparation. HW: AI model preparation. MK: data collection. MO: data modelling and analysis, and manuscript preparation. DMK: review and guarantor of this article. NP: data analysis and review. All authors read and agreed to the final version of the manuscript.

Corresponding author

Correspondence to Adalgisa Guerra.

Ethics declarations

Ethics approval and consent to participate

This study was approved by the Ethics Committee of Nova Medical School, and it was granted as a scientific project by Hospital da Luz (ID LH.INV.F2019027). Written consent was obtained from all subjects involved in the study.

Consent for publication

Not applicable.

Competing interests

The authors declare no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1: Table S1.

Standardised institutional MR image sequence parameters for Prostate Protocol at 3T. Table S2. The inter-reader agreement for MRI semantic features. Table S3. The inter-observer variability for radiomics features

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Guerra, A., Orton, M.R., Wang, H. et al. Clinical application of machine learning models in patients with prostate cancer before prostatectomy. Cancer Imaging 24, 24 (2024).
