Preoperative assessment of lymph node metastasis in colon cancer patients using machine learning: a pilot study

Background: Preoperative detection of lymph node (LN) metastasis is critical for treatment planning in colon cancer (CC). The clinical diagnostic criteria based on LN size on CT images are not sensitive enough to determine metastasis. In this retrospective study, we investigated the potential value of CT texture features for diagnosing LN metastasis by developing quantitative prediction models from preoperative CT data and patient characteristics.

Methods: A total of 390 CC patients who had undergone surgical resection were enrolled in this monocentric study. A total of 390 histologically validated LNs (one per patient) were collected and randomly separated into training (312 patients, 155 metastatic and 157 normal LNs) and test cohorts (78 patients, 39 metastatic and 39 normal LNs). Six patient characteristics and 146 quantitative CT imaging features were analyzed, and key variables were determined using either exhaustive search or the least absolute shrinkage and selection operator (LASSO) algorithm. Two kernel-based support vector machine classifiers (the patient-demographic model and the radiomic-derived model), generated with 10-fold cross-validation, were compared with the clinical model, which uses the long-axis diameter to diagnose metastatic LNs. The performance of the models was evaluated on the test cohort by computing accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC).

Results: The clinical model had an overall diagnostic accuracy of 64.87%; specifically, an accuracy of 65.38% and 62.82%, sensitivity of 83.87% and 84.62%, and specificity of 47.13% and 41.03% for the training and test cohorts, respectively. The patient-demographic model obtained an accuracy of 67.31% and 73.08%, sensitivity of 62.58% and 69.23%, and specificity of 71.97% and 76.23% for the training and test cohorts, respectively. The radiomic-derived model achieved an accuracy of 81.09% and 79.49%, sensitivity of 83.87% and 74.36%, and specificity of 78.34% and 84.62% for the training and test cohorts, respectively. Furthermore, the diagnostic performance of the radiomic-derived model was significantly higher than that of the clinical and patient-demographic models (p < 0.02) according to the DeLong method.

Conclusions: The texture of the LNs provided characteristic information about their histological status. The radiomic-derived model leveraging LN texture provides better preoperative diagnostic accuracy for the detection of metastatic LNs than the clinically accepted diagnostic criteria and the patient-demographic model.


Background
Colon cancer (CC) is a leading cause of cancer morbidity and mortality worldwide, with more than one million new cases in 2018 [1]. Despite advances in treatment options for this disease, the standard curative treatment is still complete resection of the primary tumor with dissection of regional lymph nodes (LNs) [2]. The presence of LN metastases plays a crucial role in the management and treatment strategy in CC [3,4]. In clinical practice, preoperative identification of the histological status of LNs provides the basis for accurate planning of surgery, which not only ensures the quality and quantity of LN dissection but also avoids the omission of suspected LNs. In addition, the presence of metastatic LNs determines the potential benefit of neoadjuvant chemotherapy in selected CC patients. The recent Foxtrot trial investigated the potential efficacy of neoadjuvant chemotherapy administered to CC patients with metastatic LNs [5]. Accurate preoperative detection of metastatic LNs is therefore critical for generating an effective and individualized treatment plan for CC patients.
Computed tomography (CT) is an initial diagnostic tool for the evaluation of CC in clinical examination [5]. Although this imaging modality performs well in assessing the T-stage of the tumor, its diagnostic accuracy for regional metastasis is only 54% with the current diagnostic criteria based on LN size (metastatic LNs > 10 mm) [6], which demonstrates the unreliability of size for evaluating regional metastases in CC patients [7]. Although clinicopathological characteristics show potential for the diagnosis of LN metastases [8][9][10], they require surgical resection or biopsy for confirmation, which limits their use in clinical practice. Therefore, novel approaches that use conventional CT imaging data to detect regional metastases are needed to improve preoperative treatment planning for CC patients.
Radiomics is an emerging translational field of research that aims to describe tissue characteristics by extracting high-throughput quantitative features or biomarkers from multi-modality medical imaging data [11,12]. Because it can reveal complex intra-tumor heterogeneity, radiomics is considered a powerful tool in modern medicine, including diagnosis, tumor characterization and prognosis [13]. In recent years, it has been used for the diagnosis of metastatic LNs in bladder, lung, biliary-tract, and esophagus cancers with satisfactory results [14][15][16][17]. However, only a small number of these studies benefit from pathological verification, and there is a paucity of quantitative analysis in CC to predict metastasis of regional LNs.
The purpose of this study was to develop and validate machine learning models that use either patient characteristics or quantitative CT texture features for accurate preoperative diagnosis of metastasis in regional LNs, and to compare their performance with the clinical diagnostic criteria for LNs in CC patients.

Patients
A total of 390 patients were selected for this retrospective study from a single-institution patient database, among 598 patients who were diagnosed with CC and underwent colectomy with removal of regional LNs from January 2014 to May 2018 at the Affiliated Hospital of Qingdao University. The detailed patient recruitment pathway with inclusion and exclusion criteria is described in Fig. 1.
Clinical data, including age, gender, and primary tumor site, were collected by reviewing medical records. Histological grade, T-stages, nerve invasion, and vessel invasion were obtained directly from pathological reports. The stage of the tumors was determined by surgical oncologists according to the American Joint Committee on Cancer (AJCC) TNM staging system, 8th edition [18].

CT image acquisition
The CC patients were scanned before radical resection of the colon tumors using a Siemens Somatom Sensation 64 CT scanner (Siemens Medical Solutions, Erlangen, Germany). The CT imaging parameters were selected by radiologists and kept the same for all patients: 120 kV; 200 effective mAs; beam collimation of 64 mm × 0.6 mm; a matrix of 512 × 512; a pitch of 0.8; and a gantry rotation time of 0.5 s. The CT image data were reconstructed with a slice thickness of 5 mm and resampled using B-spline interpolation to set the in-plane resolution to 1 mm before the feature extraction process.

Lymph node labeling and segmentation on CT data
After preoperative CT data acquisition, tumors and regional LNs were evaluated according to their anatomical regions as a standard procedure before the surgery. During the surgery, both the tumor and the LNs were removed, and the dissected LNs were separated into different groups according to anatomical location. Following the surgery, all the resected LNs were sent to a pathology laboratory for histological analysis. The morphological and histological features of the LNs were recorded as metadata on patient records after pathological analysis. We generated a patient cohort using histology reports and preoperative CT data with LN markings to identify the location and the histological status of the LNs. To ensure the validity of the LN histology, the LNs that were largest and closest to the tumors were selected. Afterward, the largest LN from each patient was manually outlined by an experienced radiologist using ITK-SNAP software [19] to generate a region of interest (ROI) on the slice with the maximal in-plane diameter. Later, these ROIs were validated by a senior radiologist specializing in abdominal radiology.

Feature extraction, selection, and model building
Before computing the features, the CT data were quantized using a fixed number of bins (8 bins) and rescaled into the range [0,1] using min-max normalization. The quantitative CT image features were computed by employing six feature extraction methods, namely first-order statistics (FoS) (six features), gray-level co-occurrence matrix (GLCM) (six features), gray-level run-length matrix (GLRLM) (seven features), local binary pattern (LBP) (ten features), fractal analysis (FA) (one feature) and shape features (nine features). The FoS were computed to summarize the distribution of the intensity values of the CT image data regardless of spatial positioning [20]. GLCM features were computed to analyze the tissue texture by evaluating the spatial relationship of the voxels, and GLRLM features were used to interpret the coarseness of the texture; both were computed in four directions (0°, 45°, 90°, 135°) [21,22]. Afterward, the GLCM and GLRLM features computed for each direction were merged into a vector by averaging. Local binary patterns were used to describe local spatial patterns of the intensity images, while fractal analysis was performed to measure the rate at which the complexity of the texture changes with scale [23,24]. The shape features were computed from the generated ROIs to interpret the structural characteristics of the tissues. In addition, two image filters were used to capture the texture characteristics of the LNs in the wavelet and gradient domains. The wavelet coefficient images were computed using the Daubechies kernel function to analyze the localized characteristics of the images at different scales, while the gradient images measure the directional changes of the image intensity [25]. FoS, GLRLM and GLCM features were computed from the first-level wavelet decomposition images (eighty-eight features). FoS, GLCM, and GLRLM features were also extracted from the gradient images (nineteen features) to capture phenotypic details of the tissues.
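The gray-level preprocessing and the direction-averaged GLCM step described above can be illustrated with a minimal sketch. It is not the authors' Matlab implementation: the function names are hypothetical, the ROI is assumed to be a small rectangular patch rather than an irregular mask, the offsets assume a pixel distance of 1, the co-occurrence matrix is assumed symmetric, and normalizing before quantizing is one possible reading of the order of operations. Only one GLCM feature (contrast) is shown.

```python
from typing import Dict, List, Tuple

# (row, col) offsets for the four directions named in the text, at distance 1 (assumed).
OFFSETS: Dict[int, Tuple[int, int]] = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def quantize(roi: List[List[float]], n_bins: int = 8) -> List[List[int]]:
    """Min-max normalise the patch to [0, 1], then map each pixel to one of n_bins levels."""
    flat = [v for row in roi for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # a constant patch would otherwise divide by zero
    return [[min(int((v - lo) / span * n_bins), n_bins - 1) for v in row] for row in roi]


def glcm(levels: List[List[int]], offset: Tuple[int, int], n_bins: int = 8) -> List[List[float]]:
    """Symmetric co-occurrence matrix for one offset, normalised to sum to 1."""
    dr, dc = offset
    rows, cols = len(levels), len(levels[0])
    counts = [[0.0] * n_bins for _ in range(n_bins)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = levels[r][c], levels[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0
    total = sum(sum(row) for row in counts) or 1.0
    return [[v / total for v in row] for row in counts]


def contrast(p: List[List[float]]) -> float:
    """GLCM contrast: squared gray-level difference weighted by co-occurrence probability."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def mean_glcm_contrast(roi: List[List[float]], n_bins: int = 8) -> float:
    """Average the contrast over the four directions, as the text does for GLCM features."""
    levels = quantize(roi, n_bins)
    values = [contrast(glcm(levels, off, n_bins)) for off in OFFSETS.values()]
    return sum(values) / len(values)


# A checkerboard differs along rows and columns but not along diagonals.
board = [[(r + c) % 2 for c in range(4)] for r in range(4)]
print(mean_glcm_contrast(board))  # 24.5
```

Averaging over directions makes the resulting feature less dependent on the orientation of the node in the image, at the cost of discarding direction-specific texture.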
A total of 146 features were extracted from preoperative CT data to reveal complex patterns of LN structures using in-house scripts developed in Matlab® (v9.1.2, MathWorks, MA). The correlation of the features is shown as a heat map in Fig. 2a. A regression model was generated to determine potential features using the least absolute shrinkage and selection operator (LASSO) algorithm with 10-fold cross-validation [26]. The variables included in the regression model (variables with non-zero weights) were used to generate a classifier for the diagnosis of metastatic LNs. The behavior of the cross-validation mean-squared error is shown in Fig. 2b. Moreover, Fig. 2c shows the variation of the feature weights while minimizing the cross-validation error.
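As an illustration of how LASSO with cross-validation yields a set of variables with non-zero weights, the sketch below fits a squared-error LASSO by cyclic coordinate descent and picks the regularization strength with the lowest cross-validated mean-squared error. It is a simplified stand-in, not the authors' Matlab code: all function names are hypothetical, the features are assumed to be standardized and the response centered (no intercept), the lambda grid is arbitrary, and the number of folds must not exceed the number of samples.

```python
import random
from typing import List, Sequence, Tuple


def soft_threshold(value: float, lam: float) -> float:
    """Shrink value toward zero by lam; anything within [-lam, lam] becomes exactly 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_fit(X: Sequence[Sequence[float]], y: Sequence[float], lam: float,
              n_iter: int = 200) -> List[float]:
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Xw for the initial w = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes its own contribution.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_w
    return w


def cv_mse(X: Sequence[Sequence[float]], y: Sequence[float], lam: float,
           k: int = 10, seed: int = 0) -> float:
    """Mean held-out squared error over k folds (requires k <= number of samples)."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        w = lasso_fit([X[i] for i in train], [y[i] for i in train], lam)
        sq = [(y[i] - sum(a * b for a, b in zip(X[i], w))) ** 2 for i in held_out]
        errors.append(sum(sq) / len(sq))
    return sum(errors) / k


def select_features(X, y, lambdas: Sequence[float], k: int = 10) -> Tuple[float, List[int]]:
    """Choose the lambda with the lowest CV error, refit on all data, keep non-zero weights."""
    best = min(lambdas, key=lambda lam: cv_mse(X, y, lam, k))
    w = lasso_fit(X, y, best)
    return best, [j for j, v in enumerate(w) if v != 0.0]


# Toy data: the response depends on feature 0 only, so feature 1 should be dropped.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]] * 5
y = [2.0, -2.0, 2.0, -2.0] * 5
print(select_features(X, y, [0.1, 0.5]))  # (0.1, [0])
```

The soft-thresholding step is what sets uninformative weights exactly to zero, which is why the non-zero weights can be read directly as the selected feature set.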
After extraction of the quantitative CT image features, the patients were randomly separated into two groups, keeping the same distribution of metastatic and normal LNs in the training (80%) and test cohorts (20%). The training cohort consisted of 312 patients with 155 metastatic and 157 normal LNs, while 78 patients with 39 metastatic and 39 normal LNs were included in the test cohort. The patients in the training cohort were used to optimize the classification models within a 10-fold cross-validation framework, and the test cohort was only used to evaluate the performance of the final classification models. The clinical model was generated using the LN size computed from the ROIs drawn by the experienced radiologist. Following the clinical diagnostic approach, LNs with a size of more than 10 mm were assumed to be metastatic and the remaining LNs normal.
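The class-balanced 80/20 split and the size-based clinical rule can be sketched as follows. The function names and the random seed are hypothetical, and the per-class rounding is an assumption about how the split was made. The class totals (194 metastatic, 196 normal) are the sums of the cohort counts reported above.

```python
import random
from typing import List, Sequence, Tuple


def stratified_split(labels: Sequence[int], test_fraction: float = 0.2,
                     seed: int = 0) -> Tuple[List[int], List[int]]:
    """Split indices so that each class contributes test_fraction of its cases to the test set."""
    rng = random.Random(seed)
    train: List[int] = []
    test: List[int] = []
    for cls in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test += idx[:n_test]
        train += idx[n_test:]
    return sorted(train), sorted(test)


def clinical_rule(size_mm: float, threshold_mm: float = 10.0) -> int:
    """Size criterion from the text: strictly larger than 10 mm is called metastatic (1)."""
    return 1 if size_mm > threshold_mm else 0


# 1 = metastatic, 0 = normal; totals taken from the reported cohorts.
labels = [1] * 194 + [0] * 196
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))          # 312 78
print(sum(labels[i] for i in test_idx))       # 39
print(clinical_rule(9.5), clinical_rule(12))  # 0 1
```

Splitting within each class, rather than over the pooled cases, is what keeps the metastatic-to-normal ratio nearly identical in both cohorts.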
The patient-demographic model was generated after analysis of patient characteristics acquired preoperatively (age, gender, histological grade, location, and short- and long-axis diameters of LNs). The key features were selected according to the 10-fold cross-validation error of the classifiers generated during training, and the selected key features were used to generate the patient-demographic model.
The radiomic-derived model was constructed from the patients in the training cohort using the radiomics features selected with the LASSO algorithm (contrast and run percentage of the intensity image, low gray-level run emphasis of the approximation wavelet image, and contrast and entropy of the gradient image); 10-fold cross-validation was performed to evaluate the performance of the generated model [27]. The patients in the test cohort were only used to evaluate classification performance for the diagnosis of metastatic LNs.
The diagnostic efficiency of these models was assessed against the pathological reports of the LNs in terms of accuracy, specificity, sensitivity, and area under the receiver operating characteristic curve (AUC). The AUC values were presented with 95% confidence intervals, and the statistical difference among the generated models was evaluated using the DeLong method [28].
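For reference, the four metrics can be computed from pathology labels and model outputs as in the sketch below. The function names and the toy data are hypothetical. The AUC is computed by pairwise comparison of scores, which is equivalent to the Mann-Whitney statistic. The confidence intervals and the DeLong comparison are not reproduced here.

```python
from typing import Sequence, Tuple


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Return (accuracy, sensitivity, specificity) with 1 = metastatic and 0 = normal."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)  # fraction of metastatic LNs detected
    specificity = tn / (tn + fp)  # fraction of normal LNs correctly cleared
    return accuracy, sensitivity, specificity


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a metastatic LN scores above a normal LN; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example with four nodes and a decision threshold of 0.5 on the score.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
print(confusion_metrics(y_true, y_pred))  # (0.5, 0.5, 0.5)
print(auc(y_true, scores))                # 0.75
```

Accuracy, sensitivity and specificity depend on the chosen decision threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why both kinds of metric are reported.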

Statistical analysis
The categorical demographic characteristics of the patients were evaluated with the binomial test using GraphPad Prism (v7.0, La Jolla, CA). For numerical clinical variables, the Wilcoxon rank-sum test was utilized to investigate the statistical significance between patients with normal and metastatic LNs. p < 0.05 was accepted as statistically significant. The variables were presented as mean ± standard deviation.

Patient characteristics
A total of 390 patients (390 LNs) were included in this study, of which 312 patients (157 normal and 155 metastatic LNs) were used in the training cohort and 78 patients (39 normal and 39 metastatic LNs) in the test cohort. [...] (Table 2).

After evaluation of the six characteristics of the CC patients, the short-axis diameter of the LN gave the best classification accuracy with the lowest cross-validation error. Therefore, the patient-demographic classification model was generated using the short-axis diameter of the LNs. The model achieved an accuracy of 67.31% for the training data and 73.08% for the test data, which corresponds to the correct classification of 267 LNs (210 and 57 LNs in the training and test cohorts, respectively) [...] Fig. 4b.

The key features for the radiomic-derived model were determined by performing LASSO regularization with 10-fold cross-validation among the 146 CT image features. The five selected features were used to build the radiomic-derived classification model. The radiomic-derived model demonstrated better accuracy for the training (81.09%) and test cohorts (79.49%), an increase of over 15 percentage points compared with the CT-image diagnostic criteria, which corresponded to an additional 63 accurately diagnosed LNs. Overall, the model correctly identified 315 LNs: 253 from the training cohort and 62 from the test cohort. In the training cohort, the sensitivity of the radiomic-derived model was higher than that of the patient-demographic model but similar to that of the clinical model; in the test cohort, its sensitivity was lower than that of the CT-image diagnostic criteria but higher than that of the patient-demographic model. Specificity was 78.34% for the training cohort and 84.62% for the test cohort. Table 2 summarizes the prediction performance of the three models.
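The reported percentages for the radiomic-derived model can be related to node counts with a short worked calculation. The sketch below, whose helper names are hypothetical, recovers the numbers of correctly classified metastatic and normal LNs from the reported sensitivity and specificity and the cohort sizes, and then recomputes the accuracy.

```python
from typing import Tuple


def correct_counts(sens_pct: float, spec_pct: float,
                   n_metastatic: int, n_normal: int) -> Tuple[int, int]:
    """Convert reported sensitivity/specificity (in %) into whole-node counts."""
    true_pos = round(sens_pct / 100 * n_metastatic)
    true_neg = round(spec_pct / 100 * n_normal)
    return true_pos, true_neg


def accuracy_pct(true_pos: int, true_neg: int, n_total: int) -> float:
    """Accuracy in percent, rounded to two decimals as in the text."""
    return round(100 * (true_pos + true_neg) / n_total, 2)


# Radiomic-derived model, training cohort: 155 metastatic and 157 normal LNs.
train_tp, train_tn = correct_counts(83.87, 78.34, 155, 157)
# Radiomic-derived model, test cohort: 39 metastatic and 39 normal LNs.
test_tp, test_tn = correct_counts(74.36, 84.62, 39, 39)

print(train_tp + train_tn, accuracy_pct(train_tp, train_tn, 312))  # 253 81.09
print(test_tp + test_tn, accuracy_pct(test_tp, test_tn, 78))       # 62 79.49
```

The recovered totals of 253 and 62 correctly classified LNs, and the accuracies of 81.09% and 79.49%, agree with the values stated for the training and test cohorts.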

Discussion
In our study, we compared the diagnostic accuracy of the clinical criteria for detecting metastatic LNs in CC patients with that of two classification models: the patient-demographic model, which uses the short-axis LN diameter, and the radiomic-derived model, which incorporates five texture features of preoperative CT data. Our results demonstrated that the radiomic-derived model performed significantly better than the clinical diagnostic criteria and the patient-demographic model for the detection of normal and metastatic LNs in CC patients (Table 2).
Metastatic LNs play a crucial role in preoperative stage evaluation and the development of treatment planning. In clinical practice, metastatic LNs are identified based on long-axis diameter during the evaluation of CT images [29][30][31]. However, the diagnostic performance for LNs in clinical studies is strongly affected by the unreliability of LN size for the diagnosis of nodal metastasis in CC [32]; for example, one clinical study reported that metastatic LNs were detected with an accuracy of 54% in CC patients using CT [6]. In our study, we used the clinically accepted diagnostic criteria for metastatic LNs (> 10 mm) to evaluate the detection performance for metastatic LNs [7]. The histogram of LN size with a bin width of 0.25 mm showed that 298 LNs (normal or metastatic) were clustered in the same bins, resulting in a 74.87% overlap; accordingly, the clinical diagnostic criteria correctly classified 253 of the 390 LNs. Other studies investigated morphological characteristics of LNs, e.g. visible internal heterogeneity and irregular boundary, to differentiate normal and metastatic LNs in CC patients, which improved the specificity from 63% to 73%; however, the long-term effect is still being investigated [33][34][35]. Because visible heterogeneity and boundary irregularity of metastases are of limited presence on CT images, Rollven et al. suggested that morphological CT criteria are not sufficient for nodal staging [36]. Therefore, better tools are urgently needed to accurately diagnose LN metastasis preoperatively.
Radiomics, which computes high-dimensional quantitative features, has demonstrated potential benefits for different types of applications, e.g. diagnosis, prognosis, and prediction of treatment outcomes or overall survival. Ji et al. developed a radiomic signature interpreting the quantitative features of preoperative CT data to diagnose LN metastasis in biliary tract cancer, which was determined with a blood test [14]. The model had an AUC of 0.81 for the training and 0.80 for the validation cohorts. In addition, Shen et al. developed a multivariable model to diagnose [...]

There were several limitations to our study. It was a retrospective study that included only the largest regional LN from each patient to obtain pathological validation; therefore, our findings would benefit from a prospective study designed to collect multiple LNs from each patient. Because this was a monocentric study, we could not evaluate the reproducibility of the radiomics features, which may be affected by the acquisition parameters. Therefore, multicenter studies with different CT data acquisition parameters may improve diagnostic performance by allowing the reproducibility of these features to be assessed. Additionally, the ROIs of the LNs were drawn and validated manually by two experienced radiologists. Although manual segmentation is commonly used in clinical studies, automated segmentation would decrease the time required for data preparation. Finally, our study lacks postoperative follow-up data, so we could not examine the relationship between the texture of CT data and survival outcomes. Future studies are needed to evaluate the correlation between LN image biomarkers and overall survival.

Conclusion
Our study demonstrated that a radiomics model can be used to detect metastatic LNs preoperatively in CC patients, which can improve diagnostic accuracy compared with the current clinical standard for the diagnosis of nodal metastasis. The radiomic-derived kernel-based SVM classification model had significantly better diagnostic performance than the clinical and patient-demographic models. The findings of our study may be helpful for selecting suitable treatment approaches for CC patients and thereby improving their survival rates.