  • Regular article
  • Open Access

CT texture analysis of lung adenocarcinoma: can Radiomic features be surrogate biomarkers for EGFR mutation statuses

Cancer Imaging 2018, 18:52

  • Received: 8 March 2018
  • Accepted: 3 December 2018
  • Published:



To investigate whether radiomic features can be surrogate biomarkers for epidermal growth factor receptor (EGFR) mutation statuses.

Materials and methods

Two hundred ninety-six consecutive patients who underwent CT examination within 3 months before operation and were tested for EGFR mutations were enrolled in this retrospective study. CT texture features were extracted using open-source software with whole-volume segmentation. The association between CT texture features and EGFR mutation status was analyzed.


Of the 296 patients, 151 (51%) had EGFR mutations. Logistic regression analysis identified lower age (odds ratio [OR]: 0.968, 95% confidence interval [CI]: 0.946~0.990, p = 0.005) and a radiomic feature named GreyLevelNonuniformityNormalized (OR: 0.012, 95% CI: 0.000~0.352, p = 0.01) as predictors of exon 19 mutation; higher age (OR: 1.027, 95% CI: 1.003~1.052, p = 0.025), female sex (OR: 2.189, 95% CI: 1.264~3.791, p = 0.005) and a radiomic feature named Maximum2DDiameterColumn (OR: 0.968, 95% CI: 0.946~0.990, p = 0.005) as predictors of exon 21 mutation; and female sex (OR: 1.883, 95% CI: 1.064~3.329, p = 0.030), non-smoking status (OR: 2.070, 95% CI: 1.090~3.929, p = 0.026) and a radiomic feature named SizeZoneNonUniformityNormalized (OR: 0.010, 95% CI: 0.0001~0.852, p = 0.042) as predictors of EGFR mutations overall. The areas under the curve (AUCs) of the combined clinical and radiomic features for predicting exon 19 mutation, exon 21 mutation and EGFR mutations were 0.655, 0.675 and 0.664, respectively.


Several radiomic features are associated with the EGFR mutation status of lung adenocarcinoma. When combined with clinical features, they achieve moderate diagnostic performance in predicting the EGFR mutation status of lung adenocarcinoma. Radiomic features might serve as potential surrogate biomarkers for identifying EGFR mutation status.


  • Lung adenocarcinoma
  • Computed tomography
  • Radiomics
  • Epidermal growth factor receptor


Lung cancer is the leading cause of cancer-related death in both men and women [1]. Non–small cell lung cancer (NSCLC) accounts for 85–90% of lung cancers, and lung adenocarcinoma is the most frequently diagnosed histological subtype of NSCLC [2]. In the past decade, treatment for NSCLC has evolved from cytotoxic chemotherapy to personalized treatment based on molecular alterations, especially in patients with epidermal growth factor receptor (EGFR) mutations [3]. Small-molecule tyrosine kinase inhibitors have demonstrated a higher response rate in patients with EGFR mutations (60–80%) than in those with wild-type EGFR or unknown mutation status (10–20%) [4]. Compared with EGFR wild type, longer progression-free survival (PFS) and higher objective radiographic response rates were also observed in patients with EGFR-mutated lung cancer [5]. Therefore, knowledge of the EGFR mutation status of lung adenocarcinomas is essential for personalized therapy. Like most solid tumors, lung adenocarcinomas show widespread molecular heterogeneity. Regional heterogeneity in the distribution of mutations in lung adenocarcinoma genomes has also been observed [3]. This heterogeneity is a critical barrier to the development of precision medicine approaches because the standard approach to tumor sampling, often invasive needle biopsy, cannot fully capture the spatial state of the tumor [6]. In addition, molecular analysis is often expensive and requires an adequate number of tumor cells, so rebiopsy is sometimes needed.

CT is the modality most commonly used by clinicians for diagnosis, staging and assessment of treatment response in lung cancer. It is widely accessible and can be repeated at low cost with little invasiveness. Several CT features have been shown to be associated with EGFR mutation status [7]. Conventional analysis of medical images relies on visual assessment by radiologists. When medical images are regarded merely as pictures, abundant data features beyond the ability of the naked eye may be discarded. Radiomics uses computers to extract a large number of quantitative features from image data to explore the biologic properties of lesions and then analyzes these features for decision making. It has emerged as a promising technique for identifying gene phenotypes in several kinds of tumors, an application termed radiogenomics. Using radiogenomics, Liu et al. showed that mutant EGFR status could be predicted by a set of 5 radiomic features [8]. Because lung cancer has a high mutation burden, EGFR mutations can occur from exon 18 to exon 21. The sensitivity of EGFR-mutant tumors to small-molecule tyrosine kinase inhibitors also differs among EGFR genotypes [9].

Exon 19 mutations are associated with a higher response rate and longer survival after treatment with tyrosine kinase inhibitors than exon 21 mutations [10, 11]. Because exon 19 and 21 mutations account for 90% of EGFR mutations in lung adenocarcinomas, identifying these two kinds of mutations is essential for personalized treatment [12]. We hypothesize that the differences between EGFR mutations may result in microstructural alterations. The aim of this study is to assess whether radiomic features obtained by CT texture analysis can identify lung adenocarcinomas with EGFR exon 19 and exon 21 mutations.

Materials and methods


This retrospective study was approved by the Medical Ethics Committee of Shenzhen People’s Hospital, and informed consent was waived. From January 2011 to October 2017, 306 consecutive patients with pathologically proven lung adenocarcinoma were selected from the database of our hospital. The inclusion criteria were: (1) thoracic CT performed within 3 months before operation; (2) available results of EGFR mutation testing; (3) available clinical data.


CT scans were performed on a 16-slice (Brilliance 16, Philips health system, Cleveland, USA) or a 128-slice (Brilliance iCT, Philips health system, Cleveland, USA) CT scanner, without intravenous contrast medium administration. A collimation of 16 × 1.5 mm was used for the 16-slice scanner and a collimation of 128 × 0.625 mm for the 128-slice scanner, both with automatic adjustment of tube current. Images were reconstructed with a slice thickness of 2 mm and an increment of 1 mm for the 16-slice scanner, and with a slice thickness of 1.5 mm and an increment of 0.75 mm for the 128-slice scanner, both with a Y-sharp filter. All image data were read from the picture archive and communication system (PACS).

Tumor segmentation and Radiomic features extraction

The three-dimensional volume CT image data were transferred to a computer, and the radiomic features were extracted using open-source software called PyRadiomics, which is available online. The operating procedure is described in reference [13]. The software can identify and segment pulmonary lesions automatically (Fig. 1). The region of interest (ROI) was manually adjusted for accurate segmentation by a radiologist with 3 years of experience in thoracic radiology. If there were multiple pulmonary lesions, the radiologist identified the tumor according to the pathological record and surgical markers. Ninety-four texture features, comprising first order features (19 features), gray-level-co-occurrence matrix (GLCM) features (27 features), gray-level-run-length matrix (GLRLM) features (16 features), gray-level size zone matrix (GLSZM) features (16 features) and shape features (16 features), were extracted from the marked lesions. The definitions of these radiomic features are available online. First order features describe the distribution of voxel intensities within the ROI using common and basic metrics. GLCM features describe the second-order joint probability function of an image region constrained by a mask. GLRLM features quantify gray level runs, which are defined as the length, in number of pixels, of consecutive pixels that have the same gray level value. GLSZM features quantify gray level zones in an image. A gray level zone is defined as the number of connected voxels that share the same gray level intensity. Shape features include descriptors of the three-dimensional size and shape of the ROI.
Fig. 1

The process of PyRadiomics. The software first automatically segments three lesions in the right lung; a radiologist then identifies the cancer according to the surgical record and makes manual adjustments for accurate segmentation; finally, radiomic features are extracted and output for analysis
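As a rough illustration of the GLSZM description above, the following minimal sketch finds the gray level zones of a tiny two-dimensional image and derives two normalized non-uniformity values from them. It is not the PyRadiomics implementation. The toy image, the function names and the use of 8-connectivity within a single 2D slice are assumptions made for illustration, whereas the study extracted features from three-dimensional volumes. The normalization used here (sum of squared counts divided by the squared number of zones) is the commonly used definition and is likewise stated as an assumption.

```python
from collections import Counter


def gray_level_zones(image):
    """Return one (gray_level, zone_size) pair per 8-connected zone of a 2D image."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    zones = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            level = image[r][c]
            stack = [(r, c)]
            seen[r][c] = True
            size = 0
            # Flood fill: collect all connected pixels with the same gray level.
            while stack:
                y, x = stack.pop()
                size += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and not seen[ny][nx]
                                and image[ny][nx] == level):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            zones.append((level, size))
    return zones


def nonuniformity_normalized(zones, index):
    """Sum of squared zone counts per category, divided by (number of zones)^2.

    index=0 groups zones by gray level, index=1 groups them by zone size.
    """
    counts = Counter(zone[index] for zone in zones)
    return sum(n * n for n in counts.values()) / len(zones) ** 2


# Hypothetical 3 x 4 image with three gray levels (not study data).
toy_image = [
    [1, 1, 2, 2],
    [1, 3, 2, 2],
    [3, 3, 3, 1],
]
zones = gray_level_zones(toy_image)
gray_level_nun = nonuniformity_normalized(zones, index=0)
size_zone_nun = nonuniformity_normalized(zones, index=1)
```

In this sketch, lower values indicate that zones are spread more evenly across gray levels or zone sizes, which is the kind of heterogeneity measure the texture features are meant to capture.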

Statistical analysis

The statistical analysis was performed with IBM SPSS Statistics version 24. A 2-sided P value of < .05 was regarded as statistically significant. Univariate analysis was performed first. The Fisher exact test and the Kruskal-Wallis test were used for categorical and continuous variables, respectively. Then, all clinical characteristics and radiomic features with statistical significance were entered into logistic regression analysis. Receiver operating characteristic (ROC) curves were constructed and the area under the curve (AUC) was calculated for predicting EGFR mutation status with radiomic features alone and with the combination of radiomic features and clinical presentations.
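To make the univariate step for categorical variables more concrete, the sketch below computes a two-sided Fisher exact P value for a 2 × 2 table from the hypergeometric distribution. It is only an illustration in plain Python, not the SPSS procedure used in the study. The function name, the small tolerance for comparing probabilities and the example counts are all assumptions.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell with fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in (prob(k) for k in range(lowest, highest + 1))
               if p <= observed * (1 + 1e-7))


# Hypothetical counts (not study data): rows = two patient groups,
# columns = mutation present / absent.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

A variable would pass the univariate screen described above when this P value falls below the .05 threshold.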


Patients’ clinical features and EGFR mutations

The final study population comprised 296 patients who fulfilled the inclusion criteria. Ten patients were excluded: 3 because the interval between CT scanning and operation exceeded 3 months; 1 because of multiple lesions, only one of which harbored adenocarcinoma and could not be matched to the CT findings owing to an illegible pathological record; 4 because they received chemotherapy or radiotherapy before operation; and 2 because the tumor margin could not be segmented on CT images owing to lesions adjacent to the pulmonary hilum. A flow chart of the study population is presented in Fig. 2. Among the 296 patients, there were 5 tumors with EGFR exon 18 mutation, 66 with exon 19 mutation, 7 with exon 20 mutation and 78 with exon 21 mutation. Five tumors harbored 2 exon mutations, and no patient had more than 2 exon mutations. Therefore, 151 patients had EGFR mutations, a prevalence of 51%. The patients’ clinical features are presented in Table 1. EGFR mutations were significantly more common in females and nonsmokers.
Fig. 2

The workflow diagram shows selection of study population and exclusion criteria
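The patient counts reported above can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the variable names are illustrative.

```python
# Exclusions reported in the text, by reason.
excluded = 3 + 1 + 4 + 2
screened_patients = 306
total_patients = screened_patients - excluded

# Tumors per mutated exon, as reported in the text.
exon_counts = {"exon 18": 5, "exon 19": 66, "exon 20": 7, "exon 21": 78}
double_mutants = 5  # tumors with two exon mutations are counted twice above

mutated_patients = sum(exon_counts.values()) - double_mutants
prevalence = mutated_patients / total_patients
print(f"{mutated_patients} of {total_patients} patients ({prevalence:.0%})")
```

Subtracting the five double-mutant tumors once avoids counting those patients twice, which is how 156 exon mutations correspond to 151 patients.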

Table 1

Clinical features of patients with lung adenocarcinoma

Columns: EGFR mutation (n = 151); EGFR wild type (n = 145); P value

56.69 ± 12.30; 60.43 ± 12.23

Smoking status


Texture analysis and diagnostic performance

Because only 5 and 7 tumors harbored exon 18 and exon 20 mutations, respectively, the present study focused on exon 19 and 21 mutations. All exon mutations were then taken into account, and the patients were dichotomized into those with and without EGFR mutation. At univariate analysis, 21 radiomic features and age were significantly associated with exon 19 mutation; 21 radiomic features, smoking status and gender with exon 21 mutation; and 2 radiomic features, smoking status and gender with EGFR mutations. The variables with statistical significance at univariate analysis and their odds ratios at logistic regression analysis are presented in Table 2.
Table 2

Variables with statistical significance at univariate analysis and logistic regression

Columns: p value; logistic regression: Odds Ratio (95% confidence interval), p value

Row groups: Exon 19 mutation; Exon 21 mutation (including smoking status); EGFR mutation (including smoking status)


For exon 19 mutation, logistic regression analysis showed that lower age (odds ratio [OR]: 0.968, 95% confidence interval [CI]: 0.946~0.990, p = 0.005) and a radiomic GLCM feature named GreyLevelNonuniformityNormalized (OR: 0.012, 95% CI: 0.000~0.352, p = 0.01) were independent predictors. The area under the curve (AUC) of the radiomic feature alone for predicting exon 19 mutation was 0.609. After combination with age, the AUC reached 0.655 (Fig. 3).
Fig. 3

ROC curves of the radiomic GLCM feature named GreyLevelNonuniformityNormalized and of the combination of radiomic and clinical features for predicting exon 19 mutation

Higher age (OR: 1.027, 95% CI: 1.003~1.052, p = 0.025), female sex (OR: 2.189, 95% CI: 1.264~3.791, p = 0.005) and a radiomic shape feature named Maximum2DDiameterColumn (OR: 0.968, 95% CI: 0.946~0.990, p = 0.005) were associated with exon 21 mutation at logistic regression analysis. The AUCs of the radiomic feature alone and of its combination with clinical presentations for predicting exon 21 mutation were 0.603 and 0.675, respectively (Fig. 4).
Fig. 4

ROC curves of the radiomic shape feature named Maximum2DDiameterColumn and of the combination of radiomic and clinical features for predicting exon 21 mutation

When the patients were dichotomized into those with and without EGFR mutation, female sex (OR: 1.883, 95% CI: 1.064~3.329, p = 0.030), non-smoking status (OR: 2.070, 95% CI: 1.090~3.929, p = 0.026) and a radiomic GLSZM feature named SizeZoneNonUniformityNormalized (OR: 0.010, 95% CI: 0.0001~0.852, p = 0.042) were independent predictors of EGFR mutations. The AUCs of the radiomic feature alone and of its combination with clinical presentations for predicting EGFR mutation were 0.575 and 0.664, respectively (Fig. 5).
Fig. 5

ROC curves of the radiomic GLSZM feature named SizeZoneNonUniformityNormalized and of the combination of radiomic and clinical features for predicting EGFR mutation
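The AUC values reported for the ROC curves can be read as the probability that a randomly chosen mutated tumor receives a higher predicted score than a randomly chosen wild-type tumor. The minimal sketch below computes the AUC in that pairwise form. The function name and the example labels and scores are hypothetical and do not reproduce the study data or the SPSS output.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 = mutation, 0 = wild type; scores are model outputs.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
```

On this scale 0.5 corresponds to chance and 1.0 to perfect separation, so values of 0.655 to 0.675 indicate only modest discrimination.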


In this study, we assessed the association of radiomic features with EGFR exon 19 and exon 21 mutations of lung adenocarcinomas separately. A separate analysis of EGFR exon 19 and 21 mutations may facilitate personalized treatment of lung adenocarcinomas. The combination of radiomic features and clinical presentations achieved moderate diagnostic performance in predicting EGFR exon 19 and 21 mutations of lung adenocarcinomas.

The recent development of personalized treatment for lung carcinoma has drawn more attention to EGFR mutations. Patients with EGFR mutations, especially exon 19 mutations, show a better prognosis after tyrosine kinase inhibitor treatment. Therefore, knowledge of EGFR mutation status is essential for personalized treatment. Epidemiological studies have shown that several clinical characteristics, such as female sex, non-smoking status and East Asian origin, are associated with EGFR mutation [14, 15]. In the present study, female sex and non-smoking status were also associated with EGFR mutation. We also found that age was associated with different exon mutations: patients with exon 19 mutation tended to be younger (OR = 0.968), while patients with exon 21 mutation tended to be older (OR = 1.027). This phenomenon has not been reported before. One explanation might be racial characteristics. Another might be selection bias introduced by the retrospective nature of the study. No clinical feature can predict EGFR mutation precisely. Histopathological samples are most commonly used for determining EGFR mutation status. Because lung cancers are very heterogeneous, histopathological analysis may introduce sampling bias, especially in patients from whom only biopsy samples, which may contain few tumor cells, can be obtained. Therefore, an easily accessible modality that can evaluate the whole tumor in a single examination is needed.
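The age odds ratios quoted above are close to 1 because they apply to a one-unit change in age. The sketch below rescales them to a 10-unit change using the standard logistic regression relation OR = exp(β). It assumes that age was entered as a continuous variable in years, which the text does not state explicitly, and the function name is illustrative.

```python
import math


def odds_ratio_for_change(or_per_unit, delta):
    """Odds ratio implied by a change of `delta` units, given the per-unit odds ratio."""
    beta = math.log(or_per_unit)   # logistic regression coefficient
    return math.exp(beta * delta)  # equivalent to or_per_unit ** delta


# Per-unit odds ratios for age reported in the text (assumed to be per year).
or_exon19_10y = odds_ratio_for_change(0.968, 10)
or_exon21_10y = odds_ratio_for_change(1.027, 10)
```

Under this assumption, ten additional years of age correspond to odds of exon 19 mutation that are roughly 28% lower and odds of exon 21 mutation that are roughly 31% higher.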

CT, as the most common modality for lung cancer, can provide abundant structural information about the whole tumor in a single scan. Rizzo and colleagues investigated the association between conventional CT features and EGFR, ALK and KRAS mutations in non-small cell lung cancer [7]. They found that some traditional CT features, including air bronchogram, pleural retraction, small lesion size and absence of fibrosis, were associated with EGFR mutations in non-small cell lung cancer. Although that study obtained an AUC of 0.8235 after combining CT features and clinical characteristics to predict EGFR mutations, exon mutations of EGFR were not analyzed separately, and the abundant digital information in CT images beyond what radiologists can perceive with the naked eye was not included. Radiomics, defined as the high-throughput extraction of imaging characteristics from digital imaging data, has emerged recently and shows promise for improved decision support [16–19]. Radiomics can also serve as a bridge between medical imaging and precision medicine [20]. In lung cancer, several studies have demonstrated the possibility of using radiomic features as biomarkers for patient outcome or genetic characteristics. To the best of our knowledge, the relationship between radiomic features and EGFR mutations, especially with exon mutations considered separately, has not been well established. Ozkan and colleagues [21] explored the association between CT gray-level texture features and EGFR mutation status in a small group of 25 patients with EGFR mutation and 20 patients with EGFR wild type. Although they found that several CT gray-level texture features were associated with EGFR mutations, the small sample size might have introduced selection bias. In another study, Liu et al. [8] included 298 patients and obtained an AUC of 0.709 for predicting EGFR mutation when radiomic features were combined with clinicopathological characteristics. In that study, exon mutations were not treated separately.

In the present study, a sample size similar to that of Liu et al. was examined, and exon 19 and exon 21 mutations were analyzed separately. For clinical characteristics, we found that female sex and non-smoking status were associated with a high prevalence of EGFR mutations and exon 21 mutation, while younger patients were more likely to have exon 19 mutation. Different radiomic features were associated with different EGFR mutation statuses. Using logistic regression, we identified several clinical characteristics and radiomic features as predictors of EGFR mutation status. Moderate diagnostic performance was obtained after combining the clinical and radiomic predictors (AUCs of 0.655, 0.675 and 0.664 for exon 19 mutations, exon 21 mutations and all exon mutations, respectively). Because no standard exists, different studies use different methods for texture analysis, and much of the software is developed in-house. Therefore, reproduction and comparison are difficult or even impossible. In this study, we used open-source software called PyRadiomics, which is developed in Python and whose source code and documentation are publicly accessible online. We hope that our data will be comparable with those of other current or future studies using the same software.

Our preliminary study showed that radiomic features extracted from CT images might be promising biomarkers for predicting EGFR mutations of lung cancer in vivo. Because lung cancers are highly heterogeneous, gene-expression profiling based on tissue specimens may be subject to sampling errors, especially for biopsy specimens. Radiomic features can overcome this limitation by capturing intratumoral heterogeneity in a non-invasive, three-dimensional manner. Harnessing these radiomic features can aid decision making in clinical practice, such as guiding biopsy and treatment selection. Therefore, radiomic signatures from CT images might be a powerful tool for precision diagnosis and treatment of lung cancer.

This study has several limitations. First, it was a retrospective study and was subject to patient selection bias. Second, when there was more than one pulmonary lesion, the tumor was identified by a single radiologist who knew the operation record. Therefore, operator dependence might have been introduced. In addition, traditional radiological findings were not included in the present study. Rizzo and colleagues reported that some traditional radiological findings were associated with the genotype of lung cancers [7]. Therefore, combining traditional radiological findings with radiomic features might improve the diagnostic performance in predicting EGFR mutation status. Another limitation was that age was found to be associated with different exon mutations in the present study, but the underlying reason could not be identified.


In summary, this study showed that several radiomic features were associated with EGFR mutation status in lung adenocarcinomas. Although only moderate diagnostic performance was obtained after combining clinical characteristics and radiomic features, radiomic features might serve as potential surrogate biomarkers for identifying EGFR mutations. Further prospective radiogenomic studies with large sample sizes are needed to determine which features can predict EGFR mutation in lung adenocarcinomas.



AUC: Area under the curve

CI: Confidence interval

EGFR: Epidermal growth factor receptor

GLCM: Gray-level-co-occurrence matrix

GLRLM: Gray-level-run-length matrix

GLSZM: Gray-level size zone matrix

NSCLC: Non–small cell lung cancer

OR: Odds ratio

PACS: Picture archive and communication system

PFS: Progression-free survival

ROC: Receiver operating characteristic

ROI: Region of interest



Not applicable.


No funding was received for this manuscript.

Availability of data and material

Please contact author for data requests.

Authors’ contributions

DM and YL carried out the collection and assembly of data and drafted the manuscript. JY carried out the immunoassays. YW participated in the design of the study and performed the statistical analysis and helped to draft the manuscript. JG conceived of the study, and participated in its design and coordination and helped to draft the manuscript. All authors read and approved the final manuscript.

Ethics approval and consent to participate

This study was approved by the Medical Ethics Committee of Shenzhen People’s Hospital, and informed consent was waived because of the retrospective nature of the study.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

Authors’ Affiliations

Department of Radiology, Shenzhen People’s Hospital, the Second Clinical Medical College, Jinan University, Shenzhen, 518020, Guangdong, China
Department of Radiology and Biomedical Imaging, University of California San Francisco, 185 Berry St, Suite 350, San Francisco, CA 94107, USA


  1. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2017. CA Cancer J Clin. 2017;67(1):7–30.
  2. Travis WD. Pathology of lung cancer. Clin Chest Med. 2011;32:669–92.
  3. Devarakonda S, Morgensztern D, Govindan R. Genomic alterations in lung adenocarcinoma. Lancet Oncol. 2015;16(7):e342–51.
  4. Riely GJ, Pao W, Pham D, et al. Clinical course of patients with non-small cell lung cancer and epidermal growth factor receptor exon 19 and exon 21 mutations treated with gefitinib or erlotinib. Clin Cancer Res. 2006;12:839–44.
  5. Mok TS, Wu YL, Thongprasert S, et al. Gefitinib or carboplatin-paclitaxel in pulmonary adenocarcinoma. N Engl J Med. 2009;361:947–57.
  6. Aerts HJWL. The potential of radiomic-based phenotyping in precision medicine: a review. JAMA Oncol. 2016;2:1636–42.
  7. Rizzo S, Petrella F, Buscarino V, et al. CT radiogenomic characterization of EGFR, K-RAS, and ALK mutations in non-small cell lung cancer. Eur Radiol. 2016;26(1):32–42.
  8. Liu Y, Kim J, Balagurunathan Y, et al. Radiomic features are associated with EGFR mutation status in lung adenocarcinomas. Clin Lung Cancer. 2016;17(5):441–8.
  9. Sharma SV, Bell DW, Settleman J, Haber DA. Epidermal growth factor receptor mutations in lung cancer. Nat Rev Cancer. 2007;7:169–81.
  10. Jackman DM, Yeap BY, Sequist LV, et al. Exon 19 deletion mutations of epidermal growth factor receptor are associated with prolonged survival in non-small cell lung cancer patients treated with gefitinib or erlotinib. Clin Cancer Res. 2006;12:3908–14.
  11. Zhu JQ, Zhong WZ, Zhang GC, et al. Better survival with EGFR exon 19 than exon 21 mutations in gefitinib-treated non-small cell lung cancer patients is due to differential inhibition of downstream signals. Cancer Lett. 2008;265:307–17.
  12. Sequist LV, Bell DW, Lynch TJ, Haber DA. Molecular predictors of response to epidermal growth factor receptor antagonists in non-small-cell lung cancer. J Clin Oncol. 2007;25:587–95.
  13. van Griethuysen JJM, Fedorov A, Parmar C, et al. Computational radiomics system to decode the radiographic phenotype. Cancer Res. 2017;77(21):e104–7.
  14. Paez JG, Jänne PA, Lee JC, et al. EGFR mutations in lung cancer: correlation with clinical response to gefitinib therapy. Science. 2004;304:1497–500.
  15. Kosaka T, Yatabe Y, Endoh H, et al. Mutations of the epidermal growth factor receptor gene in lung cancer: biological and clinical implications. Cancer Res. 2004;64:8919–23.
  16. Gillies RJ, Kinahan PE, Hricak H. Radiomics: images are more than pictures, they are data. Radiology. 2016;278:563–77.
  17. Huang Y, Liu Z, He L, et al. Radiomics signature: a potential biomarker for the prediction of disease-free survival in early-stage (I or II) non-small cell lung cancer. Radiology. 2016;281:947–57.
  18. Lambin P, Rios-Velazquez E, Leijenaar R, et al. Radiomics: extracting more information from medical images using advanced feature analysis. Eur J Cancer. 2012;48:441–6.
  19. Kumar V, Gu Y, Basu S, et al. Radiomics: the process and the challenges. Magn Reson Imaging. 2012;30:1234–48.
  20. Lambin P, Leijenaar RTH, Deist TM, et al. Radiomics: the bridge between medical imaging and personalized medicine. Nat Rev Clin Oncol. 2017;14(12):749–62.
  21. Ozkan E, West A, Dedelow JA, et al. CT gray-level texture analysis as a quantitative imaging biomarker of epidermal growth factor receptor mutation status in adenocarcinoma of the lung. AJR Am J Roentgenol. 2015;205:1016–25.


© The Author(s). 2018