
Table 3 Performance drop of models trained on single-center data and applied to unseen multi-center data, using non-robust features and robust features with priors, averaged across class boundaries (lower is better). Values are listed as means with 95% confidence intervals, calculated with the adjusted bootstrap percentile (BCa) method. The lowest drop is indicated in bold for each metric. Bal. Acc.: Balanced accuracy, Acc.: Accuracy

From: Radiomics for glioblastoma survival analysis in pre-operative MRI: exploring feature robustness, class boundaries, and machine learning techniques

| Feature set | AUC drop | Bal. acc. drop | Acc. drop | Specificity drop | Sensitivity drop | F1 drop | Precision drop |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Non-robust features | 0.52 CI: [0.50,0.56] | 0.40 CI: [0.26,0.45] | 0.48 CI: [0.33,0.53] | 0.80 CI: [0.70,0.88] | 0.06 CI: [0.00,0.15] | 0.38 CI: [0.24,0.50] | 0.54 CI: [0.39,0.63] |
| Robust features, sequence prior | 0.30 CI: [0.22,0.36] | 0.26 CI: [0.03,0.35] | 0.37 CI: [0.33,0.43] | 0.40 CI: [−0.10,0.75] | 0.18 CI: [0.00,0.34] | 0.38 CI: [0.24,0.53] | 0.51 CI: [0.37,0.65] |
| Robust features, hand-picked | 0.32 CI: [0.27,0.36] | 0.26 CI: [0.18,0.31] | 0.33 CI: [0.27,0.37] | 0.42 CI: [0.27,0.50] | 0.16 CI: [0.02,0.37] | 0.35 CI: [0.22,0.54] | 0.48 CI: [0.35,0.66] |
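
The caption reports each drop as a mean with a 95% BCa bootstrap confidence interval. The following is a minimal sketch of how such an interval can be computed in general, not the authors' implementation. The function name `bca_interval`, the resample count, the seed, and the input values are all assumptions made for illustration; the input list is hypothetical and is not data from the study.

```python
import random
from statistics import NormalDist, mean


def bca_interval(data, stat=mean, confidence=0.95, n_resamples=5000, seed=0):
    """Return (estimate, low, high) using the BCa bootstrap (illustrative sketch)."""
    data = list(data)
    rng = random.Random(seed)
    n = len(data)
    estimate = stat(data)

    # Bootstrap replicates: resample with replacement and recompute the statistic.
    reps = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))

    # Bias correction z0 from the share of replicates below the point estimate.
    # The share is clamped away from 0 and 1 so the inverse CDF stays defined.
    below = sum(r < estimate for r in reps) / n_resamples
    below = min(max(below, 1 / n_resamples), 1 - 1 / n_resamples)
    norm = NormalDist()
    z0 = norm.inv_cdf(below)

    # Acceleration a from leave-one-out (jackknife) estimates.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = mean(jack)
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    a = num / den if den else 0.0

    def adjusted_quantile(p):
        # Shift the nominal percentile p using z0 and a, then read it off
        # the sorted bootstrap replicates.
        z = norm.inv_cdf(p)
        adj = norm.cdf(z0 + (z0 + z) / (1 - a * (z0 + z)))
        idx = round(adj * (n_resamples - 1))
        return reps[min(max(idx, 0), n_resamples - 1)]

    alpha = 1 - confidence
    return estimate, adjusted_quantile(alpha / 2), adjusted_quantile(1 - alpha / 2)


# Hypothetical per-class-boundary drops (NOT values from the paper).
hypothetical_drops = [0.28, 0.31, 0.35, 0.22, 0.30, 0.36, 0.27, 0.33]
estimate, low, high = bca_interval(hypothetical_drops)
print(f"{estimate:.2f} CI: [{low:.2f},{high:.2f}]")
```

The printed line follows the same "mean CI: [low,high]" layout as the table cells. Unlike the plain percentile bootstrap, the BCa interval need not be symmetric around the mean, because the bias-correction and acceleration terms shift which percentiles of the bootstrap distribution are used.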