Evaluation of CT-based radiomics signature and nomogram as prognostic markers in patients with laryngeal squamous cell carcinoma

Background The aim of this study was to evaluate the prognostic value of a radiomics signature and nomogram based on contrast-enhanced computed tomography (CT) in patients after surgical resection of laryngeal squamous cell carcinoma (LSCC). Methods All patients (n = 136) were divided into a training cohort (n = 96) and a validation cohort (n = 40). LASSO regression was performed to construct the radiomics signature from CT texture features. A radiomics nomogram incorporating the radiomics signature and clinicopathologic factors was then established to predict overall survival (OS). The nomogram was validated by calibration curve, concordance index (C-index) and decision curve. Results Based on three selected texture features, the radiomics signature showed high C-indexes of 0.782 (95%CI: 0.656–0.909) and 0.752 (95%CI: 0.614–0.891) in the two cohorts. The radiomics nomogram had significantly better discrimination capability than cancer staging in the training cohort (C-index, 0.817 vs. 0.682; P = 0.009) and the validation cohort (C-index, 0.913 vs. 0.699; P = 0.019), as well as good agreement between predicted and actual survival in calibration curves. Decision curve analysis also suggested improved clinical utility of the radiomics nomogram. Conclusions The radiomics signature and nomogram showed favorable prediction accuracy for OS, which might facilitate individualized risk stratification and clinical decision-making in LSCC patients.

great importance for improving risk stratification and optimizing therapeutic strategies in LSCC patients.
It is well known that the development, therapeutic response and prognosis of tumors are associated with intratumoral heterogeneity, such as gene mutation and expression, cellular histology, angiogenesis and the tumor microenvironment [7]. Recent studies have focused on texture analysis of computed tomography (CT), which has shown potential prognostic value in several tumors, including lung cancer and liver cancer [8,9]. Image texture is a set of metrics, derived from mathematical calculations, that provide information about the spatial arrangement and variation of pixel intensities in gray-scale images [10]. Radiomics analyses are performed on image processing systems to estimate the texture features of tumors in CT images and thereby represent intratumoral heterogeneity [11,12]. The nomogram, also known as an alignment diagram, uses several scale lines to represent multiple predictors; it expresses the interrelation between variables and calculates the probability of events based on a multivariate regression model. In addition, published studies have suggested that radiomics nomograms have significant predictive accuracy for lymph node metastasis and survival outcomes in cancer patients [13,14].
Although individual CT texture parameters have shown a significant predictive role for survival and treatment failure in head and neck squamous cell carcinoma (HNSCC) [15,16], the application of a radiomics signature combining multiple CT texture markers in LSCC patients has not been well discussed. Therefore, we aimed to build a radiomics signature and a related nomogram, and to validate whether they could be used as effective prognostic markers for overall survival (OS) in LSCC patients.

Patients
The medical records of LSCC patients from January 2011 to December 2015 were reviewed. All patients underwent surgical resection and pathological examination. Cancer staging was confirmed according to the 8th edition of the AJCC-TNM staging system [17]. We included patients who had preoperative contrast-enhanced CT. However, CT images with non-identifiable tumors or image artifacts were excluded. Patients who underwent neoadjuvant chemoradiotherapy or were lost to follow-up were also excluded. Using random numbers, patients were divided into the training set and the validation set at a ratio of 7:3. The patients' demographics (age and gender), tumor characteristics (location, stage and histological grade) and treatments were compared between the two cohorts. All patients were followed up until death or the last follow-up in December 2017. We analyzed OS as the endpoint, defined as the period from definite diagnosis to death or last follow-up.
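The random 7:3 allocation described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `split_cohort`, the seed, and the use of integer patient IDs are hypothetical, and the paper does not describe the exact procedure. A nominal 70% of 136 patients rounds to 95, so the resulting sizes need not match the reported 96 and 40.

```python
import random

def split_cohort(patient_ids, train_fraction=0.7, seed=0):
    """Shuffle patient IDs with a fixed seed and split them into two cohorts.

    Hypothetical helper: the seed and rounding rule are assumptions,
    not the procedure used in the study.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)      # fixed seed keeps the split reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

train_ids, valid_ids = split_cohort(range(136))
print(len(train_ids), len(valid_ids))  # 95 41
```

Fixing the seed makes the allocation repeatable, which helps when the cohort comparison has to be re-run.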

CT imaging protocols and texture analysis
Contrast-enhanced CT images were obtained with a Philips Brilliance 16-slice CT scanner (Philips Medical System, US). All patients received an intravenous nonionic contrast agent (1.5-2.0 ml/kg, iohexol; Beijing Beilu Pharmaceutical, China). This study used a free Java software package called LIFEx (Orsay, France, http://www.lifexsoft.org) to extract radiomics features from multiple consecutive CT images with 1 mm slice thickness [18]. The LIFEx viewer supports the synchronized display of slices in three directions (coronal, sagittal and transaxial) and maximum intensity projection display. Images were labeled with random numbers and reviewed in a blinded manner. An independent experienced radiologist (HW) manually drew a region of interest (ROI) around the tumor border. The cervical lymph nodes were not included. Finally, 36 texture features were extracted with LIFEx, as described below [11].
(1) First order metrics: histogram and geometry-based features: skewness (degree of asymmetry of gray-level distribution), kurtosis (peakedness of distribution), entropy (disorder or randomness of pixel distribution), energy (homogeneity or uniformity of pixel distribution), sphericity (regularity of volume shape) and compacity (compactness of volume shape).
(2) Second order metrics: gray-level co-occurrence matrix (GLCM): homogeneity (closeness of voxel pairs), entropy, energy, contrast (local variations), correlation (gray-level linear dependence) and dissimilarity (variation of voxel pairs).
(3) Second order metrics: neighborhood gray-level dependence matrix (NGLDM): contrast (spatial rate of change of intensity) and coarseness (difference of intensity between regions).
(4) Third order metrics: gray-level run length matrix (GLRLM) and gray-level zone length matrix (GLZLM), which are calculated from a single co-occurrence matrix and provide information about the size of homogeneous runs for each gray level directly in three dimensions.
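To make the run-length features concrete, the sketch below computes two GLRLM metrics on a toy image. It is a simplified illustration, not the LIFEx implementation: it assumes a 2D image already discretized to integer gray levels starting at 1, counts runs in the horizontal direction only, and uses the commonly cited definitions HGRE = (1/N_r) Σ p(i,j)·i² and LRHGE = (1/N_r) Σ p(i,j)·i²·j², where p(i,j) is the number of runs of gray level i and length j and N_r is the total number of runs. The function names are hypothetical.

```python
from collections import Counter
from itertools import groupby

def run_length_counts(image):
    """Count horizontal runs: maps (gray_level, run_length) -> number of runs."""
    counts = Counter()
    for row in image:
        for level, run in groupby(row):   # consecutive equal pixels form one run
            counts[(level, len(list(run)))] += 1
    return counts

def hgre(counts):
    """High gray-level run emphasis: mean of level**2 over all runs."""
    total_runs = sum(counts.values())
    return sum(n * level ** 2 for (level, _), n in counts.items()) / total_runs

def lrhge(counts):
    """Long-run high gray-level emphasis: mean of level**2 * length**2 over all runs."""
    total_runs = sum(counts.values())
    return sum(n * level ** 2 * length ** 2
               for (level, length), n in counts.items()) / total_runs

# Toy 3x4 image with gray levels 1..3 (illustrative values only).
toy = [
    [1, 1, 2, 2],
    [3, 3, 3, 1],
    [2, 2, 2, 2],
]
counts = run_length_counts(toy)
print(hgre(counts))   # 3.8
print(lrhge(counts))  # 33.2
```

Both values rise when long runs of bright voxels dominate, which is why they are read as markers of coarse, high-intensity texture.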

Radiomics signature construction and validation
The least absolute shrinkage and selection operator (LASSO)-Cox regression algorithm was used to reduce the dimension of the high-dimensional data in the training dataset [19,20]. We then calculated the radiomics signature (Rad-score) as a linear combination of the selected features weighted by their LASSO coefficients. Three features with nonzero coefficients were selected (Fig. 1): high gray-level run emphasis (HGRE), long-run high gray-level emphasis (LRHGE) and zone length non-uniformity (ZLNU). The radiomics score was calculated according to the following formula: Rad-score = 0.000318616 × GLRLM_HGRE + 1.83E-05 × GLRLM_LRHGE + 0.001307454 × GLZLM_ZLNU. The cut-off values of the radiomics score were estimated from the receiver operating characteristic (ROC) curve and its area under the curve (AUC). Kaplan-Meier survival analysis was used to evaluate the unadjusted association between Rad-score and survival outcome. We then calculated the hazard ratio (HR) and the related 95% confidence interval (CI) by univariate Cox regression analysis for each variable.
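The Rad-score formula can be written as a short function. In this sketch the coefficients are the ones reported above, while the function names, the example feature values and the rule that a score strictly above the cut-off counts as high risk are assumptions made for illustration.

```python
# LASSO-Cox coefficients as reported in the text.
COEFFICIENTS = {
    "GLRLM_HGRE": 0.000318616,
    "GLRLM_LRHGE": 1.83e-05,
    "GLZLM_ZLNU": 0.001307454,
}

def rad_score(features):
    """Linear combination of the three selected texture features."""
    return sum(COEFFICIENTS[name] * features[name] for name in COEFFICIENTS)

def risk_group(score, cutoff):
    """Dichotomize a Rad-score; treating score > cutoff as high risk is an assumption."""
    return "high" if score > cutoff else "low"

# Hypothetical feature values for one patient (not taken from the study data).
example = {"GLRLM_HGRE": 1000.0, "GLRLM_LRHGE": 100000.0, "GLZLM_ZLNU": 1000.0}
score = rad_score(example)
print(round(score, 5))           # 3.45607
print(risk_group(score, 4.0))    # low (4.0 is an arbitrary illustrative cut-off)
```

Because the score is a plain weighted sum, it can be recomputed for a new patient from the three feature values alone.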

Radiomics nomogram building and assessment
Only variables that were significant in the univariate Cox analyses were further included in the nomogram for the training cohort. The radiomics nomogram, containing the radiomics signature and clinicopathologic risk factors, was built on a multivariate Cox regression model. Harrell's concordance index (C-index) represents the probability of agreement between the observed and predicted survival outcomes, and was calculated by bootstrap validation with 1000 re-samples. A C-index above 0.9 represents high accuracy, a value between 0.7 and 0.9 indicates moderate accuracy, and a C-index of 0.5 suggests no predictive ability [21]. In addition, we applied calibration curves and Hosmer-Lemeshow tests to evaluate the goodness-of-fit [22]. These analyses were performed in both the training cohort and the validation cohort. Finally, decision curve analysis (DCA) was conducted to measure the net benefit across threshold probabilities in the validation group [23]. All of the above analyses were performed in R version 3.5.3, and a p value < 0.05 was regarded as statistically significant.
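As an illustration of what the C-index measures, the sketch below implements a basic version of Harrell's estimator in plain Python. It is a simplified teaching example, not the R routine used in the study: pairs with tied survival times are skipped, no bootstrap correction is applied, and the function name and toy data are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher-risk subject fails first.

    A pair (i, j) is comparable when subject i had an observed event and
    subject j was still under observation at a strictly later time.
    Tied risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue                      # censored subjects cannot anchor a pair
        for j in range(n):
            if times[j] > times[i]:       # j outlived i, so the pair is comparable
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable

# Toy data: follow-up in months, event flag (1 = death), predicted risk.
times = [5, 10, 15, 20]
events = [1, 0, 1, 0]
risks = [0.9, 0.4, 0.1, 0.6]
print(harrell_c_index(times, events, risks))  # 0.75
```

In the toy data, three of the four comparable pairs are ordered correctly, giving 0.75; a model that ranks at random would score about 0.5.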

Patient characteristics
There were 136 eligible patients (128 male and 8 female) in this study. The median age at the time of initial diagnosis was 60 years (range 30-86 years). No patient had distant metastases (M0) at first diagnosis. Adjuvant intensity-modulated radiotherapy (IMRT) and cisplatin chemotherapy were used. The median follow-up period was 42 months (range 4-86 months). Twenty patients (14.7%) died and 46 patients (33.8%) developed cancer progression. Twenty-seven, seven and four patients suffered local relapse, cervical lymph node metastasis and distant organ metastasis, respectively. The complete characteristics of the training cohort and validation cohort are described in Table 1. There was no statistically significant difference between the two cohorts (all P > 0.05).

Assessment of radiomics signature
Through the LASSO-Cox analysis, three features with nonzero coefficients were selected to calculate the radiomics signature (Rad-score). No significant difference in Rad-score was found between the training cohort and the validation cohort (P = 0.646). The AUCs were 0.783 (95%CI: 0.646-0.921, P = 0.001) for the training cohort and 0.770 (95%CI: 0.617-0.922, P = 0.037) for the validation cohort. The optimal cut-off points of the Rad-score were 4.534 in training cases and 4.283 in validation cases, and patients were classified into high-risk or low-risk groups accordingly (Fig. 2a,b). The high-risk groups contained more patients who died (Fig. 2c,d). The heatmaps demonstrated that the three texture features generally tended to have higher values in high-risk patients (Fig. 2e,f). An elevated radiomics signature was significantly associated with worse OS in the training cohort (HR = 11.98, 95%CI: 2.68-53.56, p = 0.001; Fig. 3a) and the validation cohort (HR = 6.75, 95%CI: 1.35-33.70, p = 0.020; Fig. 3b).
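The text does not state which criterion defined the optimal cut-off on the ROC curve. One common choice is the Youden index (sensitivity + specificity − 1), sketched below as an assumption rather than as the authors' method. The sketch treats death as a binary label and ignores censoring; the function name and toy values are hypothetical.

```python
def youden_cutoff(scores, died):
    """Return the score threshold that maximizes sensitivity + specificity - 1.

    A patient is called high risk when score >= cutoff. Ties in the Youden
    index are resolved in favor of the lowest threshold.
    """
    positives = sum(died)
    negatives = len(died) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, d in zip(scores, died) if d and s >= cutoff)
        fp = sum(1 for s, d in zip(scores, died) if not d and s >= cutoff)
        j = tp / positives - fp / negatives   # equals sensitivity + specificity - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j

# Toy Rad-scores and death flags (illustrative values only).
scores = [3.1, 3.8, 4.2, 4.6, 4.9, 5.3]
died = [0, 0, 0, 1, 0, 1]
cutoff, j_stat = youden_cutoff(scores, died)
print(cutoff, j_stat)  # 4.6 0.75
```

A cut-off chosen this way is tuned to the cohort it was derived from, which is one reason the training and validation cut-offs can differ.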

Validation of radiomics nomogram
The clinical nomogram contained the significant characteristics of tumor location, TN stage and laryngectomy type in the training set (Table 2). The radiomics signature and the above characteristics were further included in the radiomics nomogram (Fig. 4). We then compared the C-indexes of the different models, with tumor staging as the reference (Table 3). The radiomics nomogram had higher accuracy than cancer staging in the training cohort (P = 0.009), and it also showed better predictive performance than cancer staging (P = 0.019) and the clinical nomogram (P = 0.008) in the validation cohort. In the training set, the calibration curves of 1-year and 3-year OS and the nonsignificant Hosmer-Lemeshow tests (P = 0.833; P = 0.706; Fig. 5a) showed good agreement between predicted and actual OS. Moreover, the radiomics nomogram performed well in the validation set (P = 0.952; P = 0.091; Fig. 5b). The DCA illustrated that when the threshold probability was between approximately 15% and 55%, the radiomics nomogram had a better net benefit for decision-making than the other models (Fig. 6).
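The net benefit plotted in a decision curve follows the standard formula NB = TP/n − (FP/n) × p_t/(1 − p_t), where p_t is the threshold probability. The sketch below evaluates it for a model and for the "treat all" reference strategy. It is a simplified illustration that assumes a binary outcome at a fixed time horizon and ignores censoring; the function names and toy data are hypothetical.

```python
def net_benefit(predicted_risk, outcome, threshold):
    """Net benefit of acting on patients whose predicted risk is >= threshold."""
    n = len(outcome)
    tp = sum(1 for p, y in zip(predicted_risk, outcome) if p >= threshold and y)
    fp = sum(1 for p, y in zip(predicted_risk, outcome) if p >= threshold and not y)
    return tp / n - fp / n * threshold / (1 - threshold)

def net_benefit_treat_all(outcome, threshold):
    """Reference strategy: every patient is treated regardless of the model."""
    prevalence = sum(outcome) / len(outcome)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Toy predicted risks and observed outcomes (1 = event), for illustration only.
risk = [0.10, 0.20, 0.30, 0.60, 0.70, 0.80, 0.40, 0.05, 0.90, 0.15]
outcome = [0, 0, 0, 1, 1, 0, 0, 0, 1, 0]

nb_model = net_benefit(risk, outcome, 0.5)
nb_all = net_benefit_treat_all(outcome, 0.5)
print(round(nb_model, 3), round(nb_all, 3))  # 0.2 -0.4
```

A model is useful at a given threshold when its net benefit exceeds both the "treat all" value and zero, the value for "treat none".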

Discussion
In the present study, we extracted texture features from preoperative contrast-enhanced CT images and used a machine-learning method to obtain a three-feature radiomics signature. This study extended individual texture analysis to a radiomics-based survival assessment built on multiple texture features. The results showed that the radiomics signature was a potential prognostic marker for OS. Moreover, the radiomics nomogram incorporating the radiomics signature and other clinicopathological characteristics was practical for survival prediction, with better discrimination ability than traditional cancer staging in both the training cohort and the validation cohort. These findings indicate that the radiomics signature and nomogram had additional prognostic value for OS in patients with LSCC.
Computed tomography (CT) is a widely applied tool for noninvasive diagnosis and staging of laryngeal cancer before treatment. Plain CT images reflect the non-homogeneity of intratumoral tissue and cell density due to necrosis, hemorrhage and cystic degeneration [24,25]. Additionally, enhanced scans reflect the heterogeneity of vascular supply, with increased blood supply in some areas and decreased blood supply in others [26,27]. Based on CT texture analysis, intratumoral heterogeneity can be translated into heterogeneity in the spatial distribution of pixel density, which has been related to pathological grade, tumor aggressiveness, tumor biological indices (e.g. hypoxia markers, VEGF) as well as prognosis and therapeutic response [28-30]. For head and neck cancer, most previous studies separately evaluated the prognostic value of individual CT texture parameters. For example, entropy and skewness were independently associated with OS in HNSCC patients undergoing TPF chemotherapy [15]. In histogram analysis, a pixel distribution with higher kurtosis, energy and entropy, and a positive or negative skewness, indicated increased tumor heterogeneity [31,32]. High homogeneity of PET-CT images was also identified as a predictor of progression-free survival in pharynx cancer [33]. In terms of third order metrics, significant differences in GLRLM features were found between the regional control group and the local recurrence group in HNSCC patients treated with chemoradiotherapy [16].
As mentioned, there are hundreds or even thousands of texture features, and their prognostic roles differed widely in previous studies. Analyzing single texture features one at a time is therefore time-consuming and inefficient. Furthermore, considering the potential overfitting of radiomics features, it is meaningful to reduce and shrink the feature set before model building. An integrative radiomics signature would facilitate the application of CT texture features. In our study, the radiomics signature demonstrated favorable discriminative ability in the training cohort (AUC = 0.783, C-index = 0.782) and the validation cohort (AUC = 0.770, C-index = 0.752) compared with previous studies (AUC = 0.66-0.69) [34]. In addition, an MRI-based radiomics study of HNSCC reported moderate C-indexes of 0.73 in the training cohort and 0.71 in the validation cohort [35]. These results suggest that radiomics heterogeneity of the primary mass observed in CT and MRI images might be helpful for judging prognosis and guiding treatment in cancer patients. No final conclusion has yet been reached, and more studies are still needed to determine whether the radiomics signature is an effective predictor of prognosis.
TNM staging has traditionally been regarded as an independent prognostic predictor in HNSCC patients [36]. The primary site of LSCC was also associated with survival results. The prognosis of supraglottic laryngeal cancer was poorer than that of glottic and subglottic cancer, possibly because cervical lymph node metastasis is common in supraglottic laryngeal cancer [37]. In our study, the survival results of patients undergoing partial laryngectomy were better than those of patients undergoing total laryngectomy. This was probably because patients with early-stage cancer more often received partial rather than total resection. We then incorporated the radiomics signature and the above clinical factors into a nomogram to improve survival prediction. It has been suggested that a single risk factor without a model may be insufficient to comprehensively evaluate the postoperative outcome of different patients; thus a prognostic model that considers multiple risk factors for each patient, such as a nomogram, is necessary [14]. Previous MRI-based radiomics studies also showed significantly higher C-indexes for radiomics nomograms than for TNM staging [34,38]. However, the effectiveness of the radiomics nomogram still requires further investigation.
There were several shortcomings in the present study. Firstly, due to the retrospective design and the small single-center sample, potential selection bias cannot be excluded, which limits the accuracy and reliability of the results. Secondly, the variation between observed images should be considered when outlining the ROIs. The computer-aided software used in this study may help to reduce this variation to some degree. Thirdly, although there was no significant difference in characteristics between the two cohorts, variables including therapeutic strategies and complications might act as potential confounders. Furthermore, there are many kinds of texture features and image processing software, so standardizing texture analysis would help to achieve robust results and broaden its application.

Conclusions
The contrast-enhanced CT radiomics signature was independently associated with overall survival in LSCC patients. The radiomics nomogram might act as a noninvasive and effective model to improve individualized prognostic evaluation and treatment strategies. More research is therefore warranted for better estimation, especially large-scale, prospective, multicenter studies.