
Preoperative assessment of lymph node metastasis in Colon Cancer patients using machine learning: a pilot study



Preoperative detection of lymph node (LN) metastasis is critical for planning treatments in colon cancer (CC). The clinical diagnostic criteria based on the size of the LNs are not sufficiently sensitive for determining metastasis on CT images. In this retrospective study, we investigated the potential value of CT texture features for diagnosing LN metastasis by developing quantitative prediction models from preoperative CT data and patient characteristics.


A total of 390 CC patients who had undergone surgical resection were enrolled in this monocentric study. A total of 390 histologically validated LNs were collected from the patients and randomly separated into training (312 patients, 155 metastatic and 157 normal LNs) and test cohorts (78 patients, 39 metastatic and 39 normal LNs). Six patient characteristics and 146 quantitative CT imaging features were analyzed, and key variables were determined using either exhaustive search or the least absolute shrinkage and selection operator (LASSO) algorithm. Two kernel-based support vector machine classifiers (patient-characteristic model and radiomic-derived model), generated with 10-fold cross-validation, were compared with the clinical model that uses long-axis diameter for the diagnosis of metastatic LNs. The performance of the models was evaluated on the test cohort by computing accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC).


The clinical model had an overall diagnostic accuracy of 64.87%; specifically, accuracy of 65.38% and 62.82%, sensitivity of 83.87% and 84.62%, and specificity of 47.13% and 41.03% for training and test cohorts, respectively. The patient-demographic model obtained accuracy of 67.31% and 73.08%, sensitivity of 62.58% and 69.23%, and specificity of 71.97% and 76.92% for training and test cohorts, respectively. The radiomic-derived model resulted in an accuracy of 81.09% and 79.49%, sensitivity of 83.87% and 74.36%, and specificity of 78.34% and 84.62% for training and test cohorts, respectively. Furthermore, the diagnostic performance of the radiomic-derived model was significantly higher than that of the clinical and patient-demographic models (p < 0.02) according to the DeLong method.


The texture of the LNs provided characteristic information about the histological status of the LNs. The radiomic-derived model leveraging LN texture provides better preoperative diagnostic accuracy for the detection of metastatic LNs compared to the clinically accepted diagnostic criteria and patient-demographic model.


Colon cancer (CC) is a leading cause of cancer morbidity and mortality in the world, with more than one million new cases in 2018 [1]. Despite advancements in treatment options for this disease, the standard curative treatment is still complete resection of the primary tumor with dissection of regional lymph nodes (LNs) [2]. The presence of LN metastases plays a crucial role in the management and treatment strategy in CC [3, 4]. In clinical practice, preoperative identification of the histological status of LNs provides the basis for accurate planning of surgery, which not only ensures the quality and quantity of LN dissection but also avoids the omission of suspected LNs. In addition, the presence of metastatic LNs determines the potential benefit of neoadjuvant chemotherapy in selected CC patients. The recent FOxTROT trial investigated the potential efficacy of neoadjuvant chemotherapy administered to CC patients with metastatic LNs [5]. Accurate preoperative detection of metastatic LNs, therefore, is critical for generating an effective and individualized treatment plan for CC patients.

Computed tomography (CT) is an initial diagnostic tool for the evaluation of CC in clinical examination [5]. Although this imaging modality performs well for assessing the T-stage of the tumor, its diagnostic accuracy for regional metastasis is only 54% under the current diagnostic criteria based on the size of the LNs (metastatic LNs > 10 mm) [6], which demonstrates the unreliability of using size to evaluate regional metastases in CC patients [7]. Although clinicopathological characteristics demonstrate potential for the diagnosis of LN metastases [8,9,10], the requirement of surgical resection or biopsy for confirmation limits their usage in clinical practice. Therefore, novel approaches that use conventional CT imaging data to detect regional metastases are needed to develop better preoperative treatment planning for CC patients.

Radiomics is an emerging translational field of research that aims to describe tissue characteristics by extracting high-throughput quantitative features or biomarkers from multi-modality medical imaging data [11, 12]. Due to its ability to reveal complex intra-tumor heterogeneity, radiomics is considered to be a powerful tool in modern medicine, including diagnosis, tumor characterization, and prognosis [13]. In recent years, it has been used for the diagnosis of metastatic LNs in bladder, lung, biliary-tract, and esophagus cancers with satisfactory results [14,15,16,17]. However, only a small number of these studies benefit from pathological verification, and there is a paucity of quantitative analysis in CC to predict metastasis of regional LNs.

The purpose of this study was to develop and validate machine learning models that use either patient characteristics or quantitative CT texture features for accurate preoperative diagnosis of metastasis in regional LNs, and to compare their performance with the clinical diagnostic criteria for LNs in CC patients.



For this retrospective study, a total of 390 patients were selected from the patient database of a single institution among 598 patients who were diagnosed with CC and received colectomy with the removal of regional LNs from January 2014 to May 2018 in the Affiliated Hospital of Qingdao University. The detailed patient recruitment pathway with inclusion and exclusion criteria is described in Fig. 1.

Fig. 1 Recruitment pathway for patients in this study

Clinical data, including age, gender, and primary tumor site, were collected by reviewing medical records. Histological grade, T-stages, nerve invasion, and vessel invasion were obtained directly from pathological reports. The stage of the tumors was determined by surgical oncologists according to the American Joint Committee on Cancer (AJCC) TNM staging system, 8th edition [18].

CT image acquisition

The CC patients were scanned before radical resection of colon tumors using a Siemens Somatom Sensation 64 CT scanner (Siemens Medical Solutions, Erlangen, Germany). The CT imaging parameters were selected by radiologists and kept the same for all patients: 120 kV; 200 effective mAs; beam collimation of 64 mm × 0.6 mm; a matrix of 512 × 512; a pitch of 0.8; and a gantry rotation time of 0.5 s. The CT image data were reconstructed with a slice thickness of 5 mm and resampled using b-spline interpolation to set the in-plane resolution to 1 mm before the feature extraction process.

Lymph node labeling and segmentation on CT data

After preoperative CT data acquisition, tumors and regional LNs were evaluated according to their anatomical regions as a standard procedure before the surgery. During the surgery, both tumor and LNs were removed and the dissected LNs were separated into different groups according to anatomical location. Following the surgery, all the resected LNs were sent to a pathology laboratory for histological analysis. The morphological and histological features of the LNs were recorded as metadata on patient records after pathological analysis. We generated a patient cohort using histology reports and preoperative CT data with LN markings to identify the location and the histological status of the LNs. In order to ensure the validity of the LN histology, the largest and most adjacent LNs to tumors were selected. Afterward, the largest LN from each patient was manually outlined to generate a region of interest (ROI) on the slice with maximal in-plane diameter using ITK-SNAP software [19] by an experienced radiologist. Later, these ROIs were validated by a senior radiologist in abdominal radiology.

Feature extraction, selection, and model building

Before computing the features, CT data were quantized using a fixed number of bins (8 bins) and rescaled into the range [0, 1] using min-max normalization. The quantitative CT image features were computed with six feature extraction methods: first-order statistics (FoS) (six features), gray-level co-occurrence matrix (GLCM) (six features), gray-level run-length matrix (GLRLM) (seven features), local binary pattern (LBP) (ten features), fractal analysis (FA) (one feature), and shape features (nine features). The FoS were computed to summarize the distribution of the intensity values of the CT image data regardless of spatial positioning [20]. GLCM features were computed to analyze tissue texture by evaluating the spatial relationship of the voxels, and GLRLM features were used to interpret the coarseness of the texture by computing run lengths in four directions (0°, 45°, 90°, 135°) [21, 22]. Afterward, the GLCM and GLRLM features computed for each direction were merged by averaging into a vector. Local binary patterns were used to describe local spatial patterns of intensity images, while fractal analysis was performed to measure the rate at which the complexity of the texture changes with scale [23, 24]. The shape features were computed to interpret the structural characteristics of the tissues from the generated ROIs. In addition, two image filters were used to capture the texture characteristics of the LNs in the wavelet and gradient domains. The wavelet coefficient images were computed using a Daubechies kernel function to analyze the localized characteristics of the images at different scales, while the gradient images measure the directional changes of the image intensity [25]. FoS, GLRLM, and GLCM features were computed from first-level wavelet decomposition images (eighty-eight features). FoS, GLCM, and GLRLM features were also extracted from gradient images (nineteen features) to capture phenotypic details of tissues.
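The quantization and direction-averaged GLCM step described above can be illustrated with a minimal sketch. It is not the in-house Matlab implementation: equal-width binning over the ROI range, a pixel distance of 1, symmetric counting, and the function names are all assumptions made for illustration, and the min-max rescaling step is left out.

```python
# Sketch only: 8-bin quantization and GLCM contrast averaged over four directions.
# Binning scheme, offsets, and symmetric counting are assumptions, not the
# authors' documented settings.

def quantize(roi, n_bins=8):
    """Map intensities to integer gray levels 0..n_bins-1 with equal-width bins."""
    flat = [v for row in roi for v in row]
    lo, hi = min(flat), max(flat)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant ROI
    return [[min(int((v - lo) / width), n_bins - 1) for v in row] for row in roi]

def glcm(levels, offset, n_levels=8):
    """Normalized, symmetric co-occurrence matrix for one (row, col) offset."""
    dr, dc = offset
    rows, cols = len(levels), len(levels[0])
    counts = [[0.0] * n_levels for _ in range(n_levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = levels[r][c], levels[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts] if total else counts

def contrast(p):
    """GLCM contrast: sum over (i, j) of p[i][j] * (i - j)^2."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))

# 0°, 45°, 90°, 135° expressed as (row, col) steps at distance 1.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]

def mean_contrast(roi, n_bins=8):
    """Average the contrast over the four directions, as described in the text."""
    levels = quantize(roi, n_bins)
    values = [contrast(glcm(levels, off, n_bins)) for off in OFFSETS]
    return sum(values) / len(values)
```

A uniform ROI yields a contrast of zero, while a ROI that alternates between the lowest and highest gray level yields a large contrast in the horizontal and vertical directions.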
A total of 146 features were extracted from preoperative CT data to reveal complex patterns of LN structures using in-house scripts developed in Matlab® (v9.1.2, MathWorks, MA). The correlation of the features is shown as a heat map in Fig. 2a. A regression model was generated to determine potential features using the least absolute shrinkage and selection operator (LASSO) algorithm with 10-fold cross-validation [26]. The variables included in the regression model (variables with non-zero weights) were used to generate a classifier for the diagnosis of metastatic LNs. The behavior of the cross-validation mean-squared error is shown in Fig. 2b, and Fig. 2c shows the variation of the feature weights while the cross-validation error is minimized.
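How LASSO leaves only a subset of variables with non-zero weights can be sketched with a small coordinate-descent solver. This is an illustrative implementation under stated assumptions, not the Matlab routine used in the study: the penalty `lam` is fixed here rather than chosen by 10-fold cross-validation, the inputs are assumed to be centered, and the toy data are made up.

```python
# Sketch only: LASSO by cyclic coordinate descent with soft-thresholding.
# Objective: (1 / (2n)) * ||y - Xw||^2 + lam * ||w||_1, with centered inputs.

def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # y - Xw, with w initialized to zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes w[j]'s own term.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w

# Made-up toy data: the response depends on feature 0 only.
X = [[-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0], [1.0, 1.0]]
y = [-2.0, 2.0, -2.0, 2.0]
weights = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, value in enumerate(weights) if value != 0.0]
```

In this toy case the irrelevant second feature receives a weight of exactly zero and only feature 0 is selected, which mirrors how the non-zero-weight variables were carried forward to the classifier.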

Fig. 2 The correlation of the textural features and selection of a subset of features of lymph nodes using the least absolute shrinkage and selection operator regularization. Abbreviations: DF, Degree of freedom; F8, Contrast; F16, Run percentage; F96, Low gray level run emphasis of approximate wavelet image; F126, Contrast of gradient image; F129, Entropy of the gradient image

After extraction of the quantitative CT image features, the patients were randomly separated into two groups while keeping the same distribution of metastatic and normal LNs in the training (80%) and test (20%) cohorts. The training cohort consisted of 312 patients with 155 metastatic and 157 normal LNs, while 78 patients with 39 metastatic and 39 normal LNs were included in the test cohort. The patients in the training cohort were used to optimize the classification models within a 10-fold cross-validation framework, and the test cohort was used only to evaluate the performance of the final classification models.
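A class-preserving 80/20 split of this kind can be sketched as follows. The function name, the seed, and the rounding rule are assumptions for illustration; the study does not report how the random split was implemented.

```python
import random

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with the class ratio preserved in both parts."""
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Class totals implied by the reported cohorts:
# 155 + 39 = 194 metastatic LNs (label 1) and 157 + 39 = 196 normal LNs (label 0).
labels = [1] * 194 + [0] * 196
train_idx, test_idx = stratified_split(labels)
```

With these class totals, splitting each class separately reproduces the reported cohort sizes: 312 training patients (155 metastatic, 157 normal) and 78 test patients (39 of each).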

The clinical model was generated using the LN size computed from ROIs drawn by the experienced radiologist. Following the clinical diagnostic approach, LNs with a size of more than 10 mm were assumed to be metastatic and the remaining LNs were assumed to be normal.
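The size criterion amounts to a single threshold rule, sketched below together with a tally of its outcomes against the histological label. The diameters and labels are hypothetical values chosen for illustration and are not study data.

```python
def clinical_rule(long_axis_mm, threshold_mm=10.0):
    """Return 1 (metastatic) if the long-axis diameter exceeds 10 mm, else 0 (normal)."""
    return 1 if long_axis_mm > threshold_mm else 0

def confusion_counts(diameters_mm, truth):
    """Tally true/false positives and negatives of the size rule."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for diameter, label in zip(diameters_mm, truth):
        pred = clinical_rule(diameter)
        key = ("t" if pred == label else "f") + ("p" if pred == 1 else "n")
        counts[key] += 1
    return counts

# Hypothetical example values (mm) and histological labels (1 = metastatic).
diameters = [8.5, 10.0, 12.1, 17.4, 9.0, 14.2]
truth = [0, 0, 0, 1, 1, 1]
counts = confusion_counts(diameters, truth)
```

Note that an LN of exactly 10 mm is classified as normal, because the criterion is "more than 10 mm".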

The patient-demographic model was generated after analysis of patient characteristics acquired preoperatively (age, gender, histological grade, location, short- and long-axis diameters of LNs). The key features were selected according to the performance of the 10-fold cross-validation error of the generated classifiers during training and selected key features were used to generate the patient-demographic model.
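With only six candidate characteristics, every non-empty subset (63 in total) can be scored, which is what an exhaustive search does. The sketch below assumes that subsets are ranked by cross-validation error; the scoring function is a made-up stand-in, since a real run would train and validate a classifier for each subset, and the feature names are illustrative identifiers.

```python
from itertools import combinations

FEATURES = ["age", "gender", "histological_grade", "location",
            "short_axis_diameter", "long_axis_diameter"]

def all_subsets(features):
    """Yield every non-empty subset of the candidate features."""
    for size in range(1, len(features) + 1):
        yield from combinations(features, size)

def exhaustive_search(features, cv_error):
    """Return the subset with the lowest cross-validation error."""
    return min(all_subsets(features), key=cv_error)

def toy_cv_error(subset):
    """Made-up error values; stands in for a 10-fold cross-validation estimate."""
    if subset == ("short_axis_diameter",):
        return 0.30
    return 0.35 + 0.01 * len(subset)

n_subsets = sum(1 for _ in all_subsets(FEATURES))
best = exhaustive_search(FEATURES, toy_cv_error)
```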

The radiomic-derived model was constructed from the radiomic features selected by the LASSO algorithm (contrast and run percentage of the intensity image, low gray level run emphasis of the approximate wavelet image, and contrast and entropy of the gradient image). The model was built on the patients in the training cohort, with 10-fold cross-validation used to evaluate the performance of the generated model [27]. The patients in the test cohort were used only to evaluate classification performance for the diagnosis of metastatic LNs.
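The 10-fold cross-validation used on the training cohort can be sketched as a fold generator in which every sample is held out exactly once. The shuffling seed and the interleaved fold assignment are assumptions; the study does not describe how folds were formed.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)  # hypothetical seed
    folds = [order[i::k] for i in range(k)]
    for i, val_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, val_idx

# 312 patients in the training cohort.
folds = list(k_fold_indices(312))
fold_sizes = sorted(len(val) for _, val in folds)
```

For 312 training patients this gives eight validation folds of 31 patients and two of 32.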

The diagnostic efficiency of these models was assessed against the pathological reports of the LNs in terms of accuracy, specificity, sensitivity, and area under the receiver operating characteristic curve (AUC). The AUC values were presented with 95% confidence intervals, and the statistical difference among the generated models was evaluated using the DeLong method [28].
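These metrics can be written out explicitly. In the sketch below, the confusion counts for the clinical model on the test cohort are inferred from the reported percentages (39 metastatic and 39 normal LNs); they are not listed in the text. The AUC function uses the pairwise-comparison definition and is only one of several equivalent ways to compute it; the DeLong comparison itself is not shown.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity in percent."""
    return {
        "accuracy": 100.0 * (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
    }

def auc_from_scores(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Counts inferred from the reported test-cohort percentages of the clinical model.
clinical_test = diagnostic_metrics(tp=33, fn=6, tn=16, fp=23)
rounded = {name: round(value, 2) for name, value in clinical_test.items()}
```

Rounded to two decimals, these counts reproduce the reported test-cohort values of the clinical model (accuracy 62.82%, sensitivity 84.62%, specificity 41.03%).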

Statistical analysis

The categorical demographic characteristics of the patients were evaluated with the binomial test using GraphPad Prism (v7.0, La Jolla, CA). For numerical clinical variables, the Wilcoxon rank-sum test was utilized to investigate the statistical significance between patients with normal and metastatic LNs. p < 0.05 was accepted as statistically significant. The variables were presented as mean ± standard deviation.


Patient characteristics

A total of 390 patients (390 LNs) were included in this study, of which 312 patients (157 normal and 155 metastatic LNs) formed the training cohort and 78 patients (39 normal and 39 metastatic LNs) formed the test cohort. The area of the normal LNs was measured as 90.49 ± 114.04 mm² (median: 55, range: [10, 816]), while metastatic LNs had an area of 185.31 ± 224.51 mm² (median: 122, range: [12, 1724]). The clinicopathologic characteristics of patients are summarized in Table 1. There were no statistically significant differences in gender, age, location, or histological grade of the tumor; however, perineural invasion, vascular invasion, and T-stage demonstrated a statistically significant difference (p < 0.05) between the patients with normal and metastatic LNs.

Table 1 Characteristics of patients with normal and metastatic LNs

Performance evaluation

In the clinical model, metastatic LNs were differentiated from normal LNs by evaluating the diameter of the LNs in the direction of the longest axis. In our patient cohort, including the patients selected for both training and testing, normal LNs had a long-axis diameter of 12.12 ± 5.74 mm while metastatic LNs had a diameter of 17.37 ± 8.48 mm (Fig. 3a). The Wilcoxon rank-sum test showed that the long-axis diameter of the metastatic LNs was statistically different from that of normal LNs (p < 0.01; 95% confidence interval [CI]: 3.81, 6.70). However, the histogram of LNs with a resolution of 0.25 mm demonstrated that 74.87% of normal and metastatic LNs were clustered together in the same bins (Fig. 3b); as a result, only 64.87% of the LNs were diagnosed correctly using the clinical diagnostic criteria, which corresponds to correct classification of 253 LNs (204 and 49 LNs in training and test cohorts, respectively) in 390 CC patients (Fig. 3c). Specifically, 65.38% of the patients in the training cohort had a correct diagnosis, while the diagnostic performance for the test cohort was 62.82%. The model had an AUC of 0.704 (95% CI: 0.675, 0.733) for the training cohort and 0.772 (95% CI: 0.718, 0.825) for the test cohort (Fig. 4a). The clinical model obtained a sensitivity of 83.87% and 84.62% and a specificity of 47.13% and 41.03% for training and test cohorts, respectively (Table 2).

Fig. 3 Evaluation of lymph nodes using current CT image diagnostic criteria

Fig. 4 Receiver operating characteristics curves of the CT image diagnostic criteria, clinical and radiomics models for training and test cohorts

Table 2 Predictive performance of the CT diagnostic criteria and generated classifiers

After the evaluation of six features of CC patients, the short-axis diameter of the LN had the best classification accuracy with the least cross-validation error. Therefore, the patient-demographic classification model was generated using the short-axis diameter of LNs. The model demonstrated an accuracy of 67.31% for training and 73.08% for the test data, which corresponds to the correct classification of 267 LNs (210 and 57 LNs in training and test cohorts, respectively). Moreover, the model had a sensitivity of 62.58% for training and 69.23% for the test data, with a specificity of 71.97% and 76.92% for training and test sets, respectively. This model showed an AUC of 0.706 (95% CI: 0.677, 0.735) for training and 0.773 (95% CI: 0.720, 0.827) for test data (Table 2). There was no statistically significant improvement compared to the clinical model for training and test cohorts (p = [0.982, 0.997]). The classifier performance for training and test cohorts is presented in Fig. 4b.

The key features for the radiomic-derived model were determined by performing LASSO regularization with 10-fold cross-validation among 146 CT image features. The five selected features were used to build a radiomic-derived classification model. The radiomic-derived model demonstrated better accuracy for training (81.09%) and test cohorts (79.49%), an increase of over 15% compared to the CT-image diagnostic criteria that corresponded to an additional 62 accurately diagnosed LNs (315 versus 253). In total, the model correctly identified 315 LNs: 253 from the training cohort and 62 from the test cohort. For the training cohort, the sensitivity of the radiomic-derived model (83.87%) was higher than that of the patient-demographic model and equal to that of the clinical model, while for the test cohort its sensitivity (74.36%) was lower than that of the CT-image diagnostic criteria but higher than that of the patient-demographic model. Specificity was 78.34% for the training cohort and 84.62% for the test cohort. In addition, the classifier model showed a significant increase in AUC for training (17.6%, p < 0.001) and test groups (5.2%, p < 0.02), resulting in an AUC of 0.882 (95% CI: 0.862, 0.901) for training and 0.825 (95% CI: 0.778, 0.872) for the test cohorts. Figure 4c portrays the performance of the model for the training and test sets, and Table 2 summarizes the prediction performance of the three models.


In our study, we compared the diagnostic accuracy of clinical criteria for detecting metastatic LNs in CC patients with that of two classification models: the patient-demographic model, which uses the short-axis LN diameter, and the radiomic-derived model, which incorporates five texture features of preoperative CT data. Our results demonstrated that the radiomic-derived model had significantly better performance than the clinical diagnostic criteria and the patient-demographic model for distinguishing normal and metastatic LNs in CC patients (Table 2).

Metastatic LN status plays a crucial role in preoperative stage evaluation and the development of treatment planning. In clinical practice, metastatic LNs are identified based on long-axis diameter during the evaluation of CT images [29,30,31]. However, the diagnostic performance for LNs in clinical studies is limited by the unreliability of LN size for the diagnosis of nodal metastasis in CC [32]; for example, one clinical study reported that metastatic LNs were detected with an accuracy of 54% in CC patients using CT [6]. In our study, we used the clinically accepted diagnostic criteria for metastatic LNs (> 10 mm) to evaluate the detection performance for metastatic LNs [7]. The histogram with a resolution (bin width) of 0.25 mm demonstrated that 298 LNs (normal or metastatic) were clustered in the same bins, resulting in 74.87% overlap; therefore, the clinical diagnostic criteria correctly classified only 253 LNs among 390. Other studies investigated morphological characteristics of LNs, e.g. visible internal heterogeneity and irregular boundary, to differentiate normal and metastatic LNs in CC patients, which improved the specificity from 63% to 73%; however, the long-term effect is still being investigated [33,34,35]. Due to the limited presence of visible heterogeneity and irregularity of boundary for metastases on CT images, Rollven et al. suggested that morphological CT criteria are not sufficient for nodal staging [36]. Therefore, better tools are urgently needed to accurately diagnose LN metastasis preoperatively.

Radiomics, which computes high-dimensional quantitative features, has demonstrated potential benefits for different types of applications, e.g. diagnosis, prognosis, and prediction of treatment outcomes or overall survival. Ji et al. developed a radiomic signature interpreting the quantitative features of preoperative CT data to diagnose metastatic LNs in biliary tract cancer, which was determined with a blood test [14]. The model had an AUC of 0.81 for training and 0.80 for validation cohorts. Shen et al. developed a multivariable model to diagnose LN status for esophageal cancer patients using CT-reported LN metastasis status, CT-reported positions, and 13 texture-based features. The model obtained an AUC of 0.806 and 0.771 for training and test cohorts, respectively. Although other studies have focused on the prediction of metastatic LNs in several cancer types [14,15,16,17], there is still a paucity of research that integrates radiomics features with machine learning to diagnose metastatic LNs with pathological validation [37]. In this study, we developed two machine learning models, namely a patient-demographic model using selected features from patient characteristics and a radiomic-derived model using quantitative imaging features computed from preoperative CT data, and compared them with the clinical diagnostic criteria (clinical model) for LN metastasis. While the clinical model (AUC of 0.772) and the patient-demographic model (AUC of 0.773) had similar diagnostic accuracy for the test cohort (p = 0.987), the radiomic-derived model obtained a statistically significant improvement in diagnostic performance (AUC of 0.825, p < 0.02). Specifically, 226 LNs were correctly classified by both the radiomic-derived and clinical models, while 89 LNs were correctly classified by the radiomic-derived model only and 27 LNs by the clinical model only. In addition, 48 LNs were misclassified by both the radiomic-derived and clinical models.
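The agreement counts in the preceding paragraph can be checked against the overall accuracies with a few lines of arithmetic. The variable names are illustrative; all four counts are taken from the text.

```python
# Agreement between the radiomic-derived and clinical models over all 390 LNs.
both_correct = 226
radiomic_only = 89   # correct with the radiomic-derived model only
clinical_only = 27   # correct with the clinical model only
both_wrong = 48

total = both_correct + radiomic_only + clinical_only + both_wrong
radiomic_correct = both_correct + radiomic_only
clinical_correct = both_correct + clinical_only

radiomic_accuracy = round(100 * radiomic_correct / total, 2)
clinical_accuracy = round(100 * clinical_correct / total, 2)
print(total, radiomic_correct, clinical_correct)
print(radiomic_accuracy, clinical_accuracy)
```

The four groups add up to the 390 LNs of the study, and the per-model totals (315 and 253 correctly classified LNs) match the counts reported in the results, with the clinical total corresponding to the overall accuracy of 64.87%.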

There were several limitations to our study. It was a retrospective study that included only the largest regional LN from each patient to obtain pathological validation; therefore, our findings would benefit from a prospective study designed to collect multiple LNs from each patient. Because the study was monocentric, we could not evaluate the reproducibility of the radiomics features, which may be affected by the acquisition parameters. Multicenter studies with different CT data acquisition parameters would allow the reproducibility of these features to be assessed and may improve diagnostic performance. Additionally, the ROIs of the LNs were drawn and validated manually by two experienced radiologists. Although manual segmentation is a commonly implemented approach in clinical studies, automated segmentation would decrease the time required for data preparation. Finally, our study lacks postoperative follow-up data, so we could not examine the relationship between the texture of CT data and survival outcomes. Future studies are needed to evaluate the correlation between LN image biomarkers and overall survival.


Our study demonstrated that a radiomics model can be used to detect metastatic LNs preoperatively in CC patients, which can improve the diagnostic accuracy compared to the current clinical standard for diagnosis of nodal metastasis. The kernel-based SVM classification model had significantly better diagnostic performance than clinical and patient-demographic models. The findings of our study may be helpful for the selection of suitable treatment approaches for CC patients to improve the survival rates of the patients.

Availability of data and materials

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.



AJCC: American Joint Committee on Cancer

AUC: Area under the receiver operating characteristic curve

CC: Colon cancer

CI: Confidence interval

CT: Computed tomography

FA: Fractal analysis

FoS: First-order statistics

GLCM: Gray-level co-occurrence matrix

GLRLM: Gray-level run-length matrix

LBP: Local binary patterns

LN: Lymph node

LASSO: Least absolute shrinkage and selection operator

RBF: Radial basis function

ROI: Region of interest

SVM: Support vector machines


  1. 1.

    Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018;68(6):394.

    Article  Google Scholar 

  2. 2.

    Kobayashi H, Enomoto M, Higuchi T, Uetake H, Iida S, Ishikawa T, et al. Clinical significance of lymph node ratio and location of nodal involvement in patients with right colon cancer. Dig Surg. 2011;28(3):190–7.

    Article  Google Scholar 

  3. 3.

    Grothey A, Sargent DJ. Adjuvant therapy for Colon Cancer: small steps toward precision medicine. JAMA Oncology. 2016;2(9):1133–4.

    Article  Google Scholar 

  4. 4.

    Benson AB, Venook AP, Al-Hawary MM, Cederquist L, Chen Y-J, Ciombor KK, et al. Rectal cancer, version 2.2018, NCCN clinical practice guidelines in oncology. J Natl Compr Cancer Netw. 2018;16(7):874.

    Article  Google Scholar 

  5. 5.

    Dighe S, Swift I, Brown G. CT staging of colon cancer. Clin Radiol. 2008;63(12):1372–9.

    CAS  Article  Google Scholar 

  6. 6.

    de Vries FE, da Costa DW, van der Mooren K, van Dorp TA, Vrouenraets BC. The value of pre-operative computed tomography scanning for the assessment of lymph node status in patients with colon cancer. Eur J Surg Oncol. 2014;40(12):1777–81.

    Article  Google Scholar 

  7. 7.

    Schrembs P, Martin B, Anthuber M, Schenkirsch G, Märkl B. The prognostic significance of lymph node size in node-positive colon cancer. PloS One. 2018;13(8):e0201072–e.

    Article  Google Scholar 

  8. 8.

    Sakuragi M, Togashi K, Konishi F, Koinuma K, Kawamura Y, Okada M, et al. Predictive factors for lymph node metastasis in t1 stage colorectal carcinomas. Dis Colon Rectum. 2003;46(12):1626–32.

    Article  Google Scholar 

  9. 9.

    Davison JM, Landau MS, Luketich JD, McGrath KM, Foxwell TJ, Landsittel DP, et al. A model based on pathologic features of superficial esophageal adenocarcinoma complements clinical node staging in determining risk of metastasis to lymph nodes. Clin Gastroenterol Hepatol. 2016;14(3):369–77.e3.

    Article  Google Scholar 

  10. 10.

    Zang RC, Qiu B, Gao SG, He J. A model predicting lymph node status for patients with clinical stage T1aN0-2M0 nonsmall cell lung Cancer. Chin Med J. 2017;130(4):398–403.

    Article  Google Scholar 

  11. 11.

    Lambin P, Leijenaar RTH, Deist TM, Peerlings J, de Jong EEC, van Timmeren J, et al. Radiomics: the bridge between medical imaging and personalized medicine. Nat Rev Clin Oncol. 2017;14:749.

    Article  Google Scholar 

  12. 12.

    Rizzo S, Botta F, Raimondi S, Origgi D, Fanciullo C, Morganti AG, et al. Radiomics: the facts and the challenges of image analysis. Eur Radiol Exp. 2018;2(1):36.

    Article  Google Scholar 

  13. 13.

    Gillies RJ, Kinahan PE, Hricak H. Radiomics: images are more than pictures, they are data. Radiology. 2016;278(2):563.

    Article  Google Scholar 

  14. 14.

    Ji G-W, Zhang Y-D, Zhang H, Zhu F-P, Wang K, Xia Y-X, et al. Biliary tract Cancer at CT: a Radiomics-based model to predict lymph node metastasis and survival outcomes. Radiology. 2019;290(1):90.

    Article  Google Scholar 

  15. 15.

    Shen C, Liu Z, Wang Z, Guo J, Zhang H, Wang Y, et al. Building CT Radiomics based Nomogram for preoperative esophageal Cancer patients lymph node metastasis prediction. Transl Oncol. 2018;11(3):815–24.

    Article  Google Scholar 

  16. 16.

    Wu S, Zheng J, Li Y, Yu H, Shi S, Xie W, et al. A Radiomics Nomogram for the preoperative prediction of lymph node metastasis in bladder Cancer. Clin Cancer Res. 2017;23(22):6904–11.

    CAS  Article  Google Scholar 

  17. 17.

    Bayanati H, ET R, Souza CA, Sethi-Virmani V, Gupta A, Maziak D, et al. Quantitative CT texture and shape analysis: can it differentiate benign and malignant mediastinal lymph nodes in patients with primary lung cancer? Eur Radiol. 2015;25(2):480–7.

    Article  Google Scholar 

  18. 18.

    Amin MB, Greene FL, Edge SB, Compton CC, Gershenwald JE, Brookland RK, et al. The eighth edition AJCC Cancer staging manual: continuing to build a bridge from a population-based to a more “personalized” approach to cancer staging. CA Cancer J Clin. 2017;67(2):93–9.

    Article  Google Scholar 

  19. 19.

    Yushkevich PA, Piven J, Hazlett HC, Smith RG, Ho S, Gee JC, et al. User-guided 3D active contour segmentation of anatomical structures: significantly improved efficiency and reliability. NeuroImage. 2006;31(3):1116–28.

    Article  Google Scholar 

  20. 20.

    Borchani M, Stamon G. Texture features for image classification and retrieval. 1997. p. 401–6.

  21. 21.

    Haralick RM, Shanmugam K, Dinstein I. Textural features for image classification. IEEE Trans Syst, Man, Cybern. 1973;SMC-3(6):610–21.

    Article  Google Scholar 

  22. 22.

    Galloway MM. Texture analysis using gray level run lengths. Comput Graph Image Process. 1975;4(2):172–9.

    Article  Google Scholar 

  23. 23.

    Ojala T, Pietikainen M, Maenpaa T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell. 2002;24(7):971–87.

    Article  Google Scholar 

  24. 24.

    Falconer KJ. Fractal geometry : mathematical foundations and applications. 2nd ed. Chichester: Chichester : Wiley; 2003.

    Google Scholar 

  25. 25.

    Arivazhagan S, Ganesan L. Texture classification using wavelet transform. Pattern Recogn Lett. 2003;24(9):1513–21.

    Article  Google Scholar 

  26. 26.

    Tibshirani R. Regression shrinkage and selection via the Lasso. J R Stat Soc Ser B Methodol. 1996;58(1):267–88.

    Google Scholar 

  27. 27.

    Cortes C, Vapnik V. Support-vector networks. Mach Learn. 1995;20(3):273–97.

    Google Scholar 

  28. 28.

    DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837–45.

    CAS  Article  Google Scholar 

  29. 29.

    Sloothaak DA, Grewal S, Doornewaard H, van Duijvendijk P, Tanis PJ, Bemelman WA, et al. Lymph node size as a predictor of lymphatic staging in colonic cancer. Br J Surg. 2014;101(6):701–6.

    CAS  Article  Google Scholar 

  30. Tanaka T, Nozawa H, Kawai K, Hata K, Kiyomatsu T, Nishikawa T, et al. Lymph node size on computed tomography images is a predictive indicator for lymph node metastasis in patients with colorectal neuroendocrine tumors. In Vivo. 2017;31(5):1011–7.

  31. Veit-Haibach P, Kuehle CA, Beyer T, Stergar H, Kuehl H, Schmidt J, et al. Diagnostic accuracy of colorectal cancer staging with whole-body PET/CT colonography. JAMA. 2006;296(21):2590–600.

  32. Nerad E, Lahaye MJ, Maas M, Nelemans P, Bakers FCH, Beets GL, et al. Diagnostic accuracy of CT for local staging of colon cancer: a systematic review and meta-analysis. Am J Roentgenol. 2016;207(5):984–95.

  33. Mayr P, Aumann G, Schaller T, Schenkirsch G, Anthuber M, Markl B. Lymph node hypoplasia is associated with adverse outcomes in node-negative colon cancer using advanced lymph node dissection methods. Langenbeck's Arch Surg. 2016;401(2):181–8.

  34. Rollven E, Blomqvist L, Oistamo E, Hjern F, Csanaky G, Abraham-Nordling M. Morphological predictors for lymph node metastases on computed tomography in colon cancer. Abdom Radiol (NY). 2019;44(5):1712–21.

  35. Quan Q, Zhu M, Liu S, Chen P, He W, Huang Y, et al. Positive impact of the negative lymph node count on the survival rate of stage III colon cancer with pN1 and right-side disease. J Cancer. 2019;10(4):1052–9.

  36. Rollven E, Abraham-Nordling M, Holm T, Blomqvist L. Assessment and diagnostic accuracy of lymph node status to predict stage III colon cancer using computed tomography. Cancer Imaging. 2017;17(1):3.

  37. Huang Y-q, Liang C-h, He L, Tian J, Liang C-s, Chen X, et al. Development and validation of a radiomics nomogram for preoperative prediction of lymph node metastasis in colorectal cancer. J Clin Oncol. 2016;34(18):2157–64.



Not applicable.


This study was supported by the National Cancer Institute (grants R01CA209886, R01CA196967), by 2019 Harold E. Eisenberg Foundation Scholar Award and by the Fishel Fellowship Award at the Robert H. Lurie Comprehensive Cancer Center.

Author information




A. Eresen and Y. Li performed the experiments and were involved in data analysis and manuscript preparation. J. Yang, J. Shangshuan, Y. Velichko, V. Yaghmai and A.B. Benson III were significantly involved in data analysis, interpretation, and manuscript preparation. Z. Zhang designed the study and was significantly involved in data analysis and manuscript preparation. The author(s) read and approved the final manuscript.

Corresponding authors

Correspondence to Al B. Benson III or Zhuoli Zhang.

Ethics declarations

Ethics approval and consent to participate

Our retrospective study was approved by the medical ethics committee of the Affiliated Hospital of Qingdao University and informed consent was waived.

Consent for publication

Not applicable.

Competing interests

The authors declare no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Eresen, A., Li, Y., Yang, J. et al. Preoperative assessment of lymph node metastasis in Colon Cancer patients using machine learning: a pilot study. Cancer Imaging 20, 30 (2020).



  • Colon cancer
  • Computed tomography
  • Machine learning
  • Metastatic lymph node
  • Texture analysis