Multi-window CT based Radiomic signatures in differentiating indolent versus aggressive lung cancers in the National Lung Screening Trial: a retrospective study

Background We retrospectively evaluated the capability of radiomic features to predict tumor growth in lung cancer screening and compared the performance of multi-window radiomic features with that of single-window radiomic features. Methods One hundred fifty lung nodules among 114 screen-detected, incident lung cancer patients from the National Lung Screening Trial (NLST) were investigated. Volume-doubling time (VDT) was calculated from the change in nodule volume between two consecutive scans and used to define indolent and aggressive lung cancers. Lung nodules were semi-automatically segmented using lung and mediastinal windows separately, and the difference region was generated by subtracting the mediastinal window region from the lung window region. A total of 364 radiomic features were extracted separately from nodules using the lung window, the mediastinal window and the difference region. Multivariable models were built to identify the features most predictive of tumor growth. Clinical information was also obtained from the database. Results Based on our definition, 26% of the cases were indolent lung cancer. The tumor growth pattern could be predicted by radiomic models constructed using features obtained in the lung window, the difference region, and by combining features obtained in both the lung window and the difference region, with areas under the receiver operating characteristic curve (AUROCs) of 0.799, 0.819, and 0.846, respectively. The multi-window feature model performed better than the single-window feature models (P < 0.001). Incorporating clinical factors into the multi-window feature model improved performance further, yielding an accuracy of 84.67% and an AUROC of 0.855 for distinguishing indolent from aggressive disease. Conclusions Multi-window CT-based radiomic features are valuable predictors of indolent lung cancers and outperformed features from a single CT window setting. Combining clinical information improved predictive performance.
Electronic supplementary material The online version of this article (10.1186/s40644-019-0232-6) contains supplementary material, which is available to authorized users.


Background
Lung cancer is the leading cause of cancer-related deaths among both men and women in the U.S. [1]. Screening of high-risk individuals, selected on the basis of age and smoking history, can detect lung cancer at an earlier, more treatable stage and has been shown to improve lung cancer survival rates [2,3]. Specifically, the National Lung Screening Trial (NLST) demonstrated a 20% reduction in lung cancer mortality among high-risk individuals screened with low-dose computed tomography (LDCT) versus those screened with standard chest X-ray [4]. Based on the findings from the NLST, the U.S. Preventive Services Task Force issued a recommendation for annual lung cancer screening by LDCT [5].
Despite the mortality reduction associated with lung cancer screening, there are concerns that some lung cancer diagnoses in the screening setting represent overdiagnosis of slow-growing, indolent cancers that may pose no threat, and that these diagnoses result in overtreatment [2,6,7,8,9]. In the NLST, prior studies estimated that 18 to 22.5% of screen-detected cancers would not become symptomatic in a patient's lifetime and would remain indolent [7]. Several other screening studies have estimated indolent lung cancer rates of between 2 and 25% [8][9][10]. Although the methodologies and cohort sizes vary, the existence of indolent lung cancer in lung cancer screening poses an important public health concern. Overdiagnosis of indolent lung cancer results in additional, unnecessary screening, increased costs, higher radiation exposure, undue stress for patients and their families, and the unnecessary morbidity that is sometimes associated with overtreatment. At the other extreme, prior studies have shown that small indeterminate lung nodules (< 4 mm), which did not meet the criteria for a positive screen in the NLST but developed into lung cancer in subsequent screening intervals, are associated with poorer survival and higher lung cancer mortality than cancers in patients who had a positive baseline screen, because of potentially aggressive growth over a relatively short time (1 to 2 years) [11][12][13]. As CT imaging has an important role in the longitudinal clinical management of lung lesions, it is critical to find additional imaging-based biomarkers that can distinguish biologically indolent from aggressive lung cancer at an early stage of development and that can be used to optimize the scan interval, reducing both overdiagnosis and underdiagnosis.
Radiomics has emerged as a powerful approach to characterize and quantify pulmonary nodules. By providing information on nodule size, shape, and spatial and temporal tumor heterogeneity, radiomic features can be applied to risk prediction, diagnostic discrimination and the assessment of disease progression [14][15][16][17]. In contrast to conventional radiology practice, which is based on visual interpretation, radiomics converts standard-of-care medical images into high-dimensional quantitative features that are mineable by either conventional biostatistical approaches or machine learning methods.
To date, few studies have investigated the association between radiomics and the growth rate of lung nodules. Moreover, published radiomics work on lung nodules has focused on images viewed with a single CT window, usually the lung window. Lee et al. [18] and Sajin et al. [19] showed that the different parts of lung nodules recognized by two CT windows (lung window and mediastinal window) were associated with different pathological components. In addition, some studies found that the tumor disappearance ratio, that is, the proportion of the tumor area visible in the lung window setting that disappears in the mediastinal window setting, is related to clinical-pathologic characteristics and tumor aggressiveness and is a significant independent prognostic determinant for small lung adenocarcinoma [20,21]. The motivation for our study comes from conventional radiology, in which readers commonly cycle between both windows to improve diagnostic accuracy. We therefore hypothesized that the heterogeneity and varied morphology of lung cancers would be reflected differently under different CT window settings, and that multi-window CT-based quantitative descriptors could provide an improved clinical predictor for lung cancer screening. Accordingly, we performed a radiomic analysis to identify image biomarkers that reveal differences between these two windows and that predict growth patterns of lung cancers in the lung cancer screening setting.

Study population
We obtained the LDCT images and clinical information for the NLST from the Cancer Data Access System (CDAS) [22]. The NLST study design and patient enrollment have been documented previously [4,23,24]. In brief, a total of 53,454 participants at high risk of lung cancer, aged 55 years or older with a smoking history of at least 30 pack-years (current smokers or former smokers who had quit within the previous 15 years), were randomly assigned to LDCT or radiography examination and underwent a baseline scan and two annual follow-up scans. Exclusion criteria included a previous history of lung cancer, chest CT within 18 months before enrollment and unexplained weight loss of more than 6.8 kg in the preceding year. Participants with a confirmed lung cancer diagnosis were treated and did not undergo subsequent screening examinations. This retrospective study was approved by the Institutional Review Board (IRB) at the University of South Florida (USF), and informed consent was waived.
The present study used a subset of patients that has been described in prior studies from our group [16,25,26]. Briefly, we identified 314 screen-detected, incident lung cancer patients who were not diagnosed with lung cancer at baseline screening but were diagnosed with lung cancer at either the first or the second follow-up screening interval. These lung cancer cases were derived from previously published nested case-control studies [16,26]. Of these, 200 cases were excluded for the following reasons: complete volumetric image sets were not available, the nodule at baseline could not be identified using the location information provided in the publicly available NLST data, or the tumor margin could not be contoured reliably in either CT window. As such, the final analytical cohort of incident lung cancer patients included 114 patients with 150 lesions. Among the 114 patients, 36 had imaging studies at three time points (i.e., baseline, the first follow-up, and the second follow-up). Self-reported patient clinical data from the NLST used in this analysis were age at randomization, sex, pack-years smoked, family history of lung cancer, smoking status, and history of COPD.

Volume-doubling time (VDT) and tumor growth patterns
Volume-doubling time (VDT) of a non-calcified nodule was used as the criterion for classifying indolent versus aggressive lung cancers. Volumes were calculated at the baseline screen and at all available follow-up screening intervals, and the VDT for each nodule was calculated using the following equation:

$$\mathrm{VDT} = \frac{T_i \cdot \ln 2}{\ln\left(V_i / V_0\right)}$$

where $T_i$ is the interval time between two scans, $V_0$ is the volume at the first scan, and $V_i$ is the volume at the second scan.
Nodules with a VDT more than 400 days were classified as indolent/slow-growing lung cancer, and nodules with a VDT less than 400 days were classified as aggressive/fast-growing lung cancers.
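The calculation and the 400-day rule can be illustrated with a minimal Python sketch. It assumes the standard exponential-growth form, VDT = T·ln 2 / ln(V_i / V_0). The function names are hypothetical, and two choices are assumptions rather than rules stated in the text: a VDT of exactly 400 days is assigned to the indolent group, and a shrinking nodule (negative VDT) is treated as indolent.

```python
import math

INDOLENT_THRESHOLD_DAYS = 400  # cut-off stated in the text


def volume_doubling_time(v0, vi, interval_days):
    """Return the VDT in days from two volumes measured interval_days apart."""
    if v0 <= 0 or vi <= 0:
        raise ValueError("volumes must be positive")
    if vi == v0:
        return math.inf  # no measurable growth
    return interval_days * math.log(2) / math.log(vi / v0)


def growth_pattern(vdt_days):
    """Classify a nodule from its VDT.

    Assumptions (not specified in the text): VDT == 400 days and
    negative VDT (shrinking nodule) are both labelled indolent.
    """
    if vdt_days < 0 or vdt_days >= INDOLENT_THRESHOLD_DAYS:
        return "indolent"
    return "aggressive"


# A nodule that doubles in one year has a VDT of one year.
fast = volume_doubling_time(100.0, 200.0, 365)
# A nodule that grows by 10% in one year doubles far more slowly.
slow = volume_doubling_time(100.0, 110.0, 365)
print(growth_pattern(fast), growth_pattern(slow))
```

Because the VDT depends on the logarithm of the volume ratio, small segmentation errors in slowly growing nodules can shift the VDT by a large amount, which is one reason a fixed threshold should be interpreted with care.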

Tumor segmentation and Radiomic feature extraction
All lung nodules were reviewed and segmented by two clinical radiologists (H.L. and J.Q., with 15 and 12 years of experience in thoracic imaging, respectively), who were aware of malignancy status but were blinded to clinical information and growth status. Lesions were identified and segmented using the Quantitative Imaging Decision Support (QIDS)® Platform (HealthMyne, Madison, WI) to delineate the tumor regions for this study. After the user identifies a lesion and drags a line along its longest diameter, a 2D delineation preview is presented for editing or confirmation. Once the 2D delineation is confirmed, a 3D segmentation is performed automatically, after which the boundaries can be edited and confirmed. Manual editing was needed in about 8% of the nodule volumes because of attachment to the pleura, a fissure or a vessel. Each nodule was segmented under both the standard lung window (window width 1500 HU, window level −400 HU) and the mediastinal window (window width 400 HU, window level 40 HU). All segmented images were reviewed by the two radiologists, and any discrepancies were resolved by consensus.
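To make the two window settings concrete, the sketch below converts a window width and level into the range of Hounsfield units (HU) that is displayed, using the usual convention that the window spans level ± width/2. The function names are hypothetical, and the code only illustrates display windowing; it is not a description of how the QIDS platform segments nodules.

```python
def window_bounds(width_hu, level_hu):
    """Return the (lower, upper) HU limits displayed by a CT window."""
    half = width_hu / 2
    return (level_hu - half, level_hu + half)


def apply_window(hu, width_hu, level_hu):
    """Map an HU value to a 0..1 display intensity, clipping outside the window."""
    lower, upper = window_bounds(width_hu, level_hu)
    return min(max((hu - lower) / (upper - lower), 0.0), 1.0)


LUNG = (1500, -400)        # width, level from the text
MEDIASTINAL = (400, 40)    # width, level from the text

print(window_bounds(*LUNG))         # (-1150.0, 350.0)
print(window_bounds(*MEDIASTINAL))  # (-160.0, 240.0)

# A voxel at -600 HU (illustrative low-attenuation value) is visible in the
# lung window but is clipped to black in the mediastinal window.
print(apply_window(-600, *LUNG), apply_window(-600, *MEDIASTINAL))
```

This is why low-attenuation parts of a nodule can be contoured in the lung window yet disappear in the mediastinal window, which in turn produces the difference region analyzed in this study.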
The two tumor masks (standard lung window mask and mediastinal window mask) were imported into MATLAB. The difference region between the two windows (Fig. 1), that is, the voxels that appear in the lung window mask but not in the mediastinal window mask, was obtained, and radiomic features were then extracted from two masks: the standard lung window mask and the difference region mask. Radiomic features were extracted using an in-house texture extractor implemented in MATLAB 2016b (MathWorks, Natick, USA). For each mask, 364 features were extracted, including 209 IBSI features as previously described [27,28], 125 Laws features and 30 wavelet features (Additional file 1: Table S1).
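The difference region is a set difference between the two masks. The study computed it in MATLAB; the following is only an illustrative Python sketch in which each mask is represented as a set of (slice, row, column) voxel indices. The representation and the function name are assumptions made for the example.

```python
def difference_region(lung_mask, mediastinal_mask):
    """Return voxels inside the lung-window mask but outside the mediastinal-window mask."""
    return lung_mask - mediastinal_mask


# Toy example on a single slice: a 4x4 lung-window mask with a
# 2x2 core that is also visible in the mediastinal window.
lung = {(0, r, c) for r in range(4) for c in range(4)}
mediastinal = {(0, r, c) for r in (1, 2) for c in (1, 2)}

rim = difference_region(lung, mediastinal)
print(len(lung), len(mediastinal), len(rim))  # 16 4 12

# If nothing is visible in the mediastinal window, the difference
# region is the whole lung-window mask.
print(difference_region(lung, set()) == lung)  # True
```

With array-based masks the same operation is an element-wise "lung AND NOT mediastinal"; the set form is used here only to keep the sketch free of external dependencies.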

Statistical analysis
To reduce the number of radiomic features, two separate filtering steps were applied. First, the Student's t-test was performed for each feature, comparing indolent with aggressive lung cancers, and statistically significant radiomic features (p-value < 0.05) were retained. Next, the area under the receiver operating characteristic curve (AUROC) was calculated for each feature with bootstrap resampling (200 iterations), and features with a mean AUROC ≥ 0.5 were retained. Radiomic features that were both statistically significant by the Student's t-test and had a mean AUROC ≥ 0.5 were then tested for correlation using Pearson's coefficient. Among correlated features (Pearson's coefficient ≥ 0.8), the feature with the largest mean AUROC was selected. The remaining features were then reduced using a backward elimination logistic regression approach (0.05 for entry and 0.10 for removal). Using this approach, three individual models were constructed from the lung window features, the difference region features, and the combination of features derived from the lung window and the difference region. These models yielded three distinct radiomics scores. Finally, we added patient information (sex and self-reported history of COPD) to the radiomics score-based model to investigate its incremental value for prediction. All statistical tests were 2-sided. A p-value of less than 0.05 was considered statistically significant.
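Two of the filtering steps, the bootstrapped per-feature AUROC and the pruning of correlated features, can be sketched in plain Python as follows. This is a simplified illustration and not the authors' implementation: the t-test filter and the backward elimination logistic regression are omitted, the function names are hypothetical, the absolute value of Pearson's coefficient is used, and correlated groups are resolved greedily in order of decreasing mean AUROC. All of these are assumptions.

```python
import random


def auroc(scores, labels):
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_mean_auroc(scores, labels, n_boot=200, seed=0):
    """Mean AUROC over n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(scores)
    values = []
    while len(values) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            values.append(auroc([scores[i] for i in idx], ys))
    return sum(values) / n_boot


def pearson(a, b):
    """Pearson correlation of two equal-length lists (non-constant inputs assumed)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def prune_correlated(features, mean_auc, threshold=0.8):
    """Keep the highest-AUROC feature of each correlated group (greedy, assumed)."""
    kept = []
    for name in sorted(features, key=mean_auc.get, reverse=True):
        if all(abs(pearson(features[name], features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy data: 1 = aggressive, 0 = indolent (labels and values are made up).
labels = [0, 0, 0, 1, 1, 1]
features = {
    "f1": [1, 2, 3, 4, 5, 6],
    "f2": [2, 4, 6, 8, 10, 12.5],  # nearly a rescaled copy of f1
    "f3": [1, 5, 2, 6, 3, 4],
}
mean_auc = {name: bootstrap_mean_auroc(vals, labels) for name, vals in features.items()}
selected = [n for n in prune_correlated(features, mean_auc) if mean_auc[n] >= 0.5]
print(selected)
```

In this toy example `f2` is removed because it is almost perfectly correlated with a feature that has at least as high a mean AUROC, which mirrors the rule of keeping the feature with the largest mean AUROC among correlated features.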

Results
The patient demographic data are presented in Table 1. In total, 39 (26%) nodules were classified as indolent lung cancer (median VDT 583 days) and 111 (74%) nodules were classified as aggressive (median VDT 148 days). Among the 36 patients who had a baseline screen and two follow-up screens, 17 exhibited a mixed growth pattern across the two follow-up screening intervals: 12 nodules were re-classified from indolent to aggressive between the first and second follow-up, while 5 nodules were re-classified from aggressive to indolent (Fig. 2).
In our dataset, nodule volume ranged from 4.12 to 68.74 mm³ in the lung window and from 0 to 56.40 mm³ in the mediastinal window. Volume was significantly different between the two groups but was eliminated during feature selection and did not enter the final prediction model. There were significant differences in sex and self-reported COPD between indolent and aggressive lung cancers (Table 1). Female patients were much more likely than male patients to have indolent cancers (70.00% vs 31.17%, P = 0.006). Indolent lung cancers were more frequent than aggressive lung cancers in patients without a history of COPD (P = 0.035). There were no differences in age (P = 0.196), pack-years smoked (P = 0.704), family history of lung cancer (P = 0.386), or smoking status (P = 0.309) between indolent and aggressive lung cancers. The AUROC of the multivariable logistic regression model generated with the clinical features alone was 0.742 (95% CI, 0.66 to 0.83), with an accuracy of 62.00%, a specificity of 54.05% and a sensitivity of 84.62%.
The most informative radiomic features for predicting lung cancer growth pattern were obtained from the lung window and from the difference region between the lung and mediastinal windows. The multivariable logistic regression model using radiomic features obtained in the difference region had better predictive power than the model using features from the lung window alone (Table 2). The AUROC based on difference region features was 0.820 (95% CI, 0.74 to 0.90), with an accuracy of 73.33%, a specificity of 79.49% and a sensitivity of 71.17%, while the AUROC based on lung window features alone was 0.800 (95% CI, 0.72 to 0.88), with an accuracy of 81.33%, a specificity of 66.67% and a sensitivity of 86.49%. When these two sets of features were combined, the AUROC increased to 0.845 (95% CI, 0.77 to 0.92), with accuracy and sensitivity improved to 83.33% and 84.68%, respectively. Bootstrap resampling was conducted for internal validation, and the odds ratios and performance statistics did not change to a significant extent: the AUROCs based on difference region features, lung window features and the combination of the two were 0.819 (95% CI, 0.742 to 0.90), 0.799 (95% CI, 0.72 to 0.88) and 0.846 (95% CI, 0.77 to 0.92), respectively (Table 2 and Fig. 3). We also report the incremental predictive value of clinical information, namely sex and history of COPD. The nomogram models generated with combined clinical and radiomic features (Fig. 3) were superior to the models created with radiomic features alone or clinical characteristics alone (Table 2 and Fig. 4).
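The reported accuracies can be cross-checked against the sensitivities, specificities and class sizes, since accuracy is the class-size-weighted average of the two rates. The short sketch below does this for the two single-region radiomic models. It assumes that aggressive nodules (n = 111) were treated as the positive class and indolent nodules (n = 39) as the negative class, which is an inference from the numbers and is not stated explicitly in the text. The function name is hypothetical.

```python
def accuracy_from_rates(sensitivity, specificity, n_positive, n_negative):
    """Overall accuracy implied by sensitivity, specificity and class sizes."""
    correct = sensitivity * n_positive + specificity * n_negative
    return correct / (n_positive + n_negative)


N_AGGRESSIVE, N_INDOLENT = 111, 39  # class sizes reported in the Results

# Lung-window model: sensitivity 86.49%, specificity 66.67%
lung = accuracy_from_rates(0.8649, 0.6667, N_AGGRESSIVE, N_INDOLENT)
# Difference-region model: sensitivity 71.17%, specificity 79.49%
diff = accuracy_from_rates(0.7117, 0.7949, N_AGGRESSIVE, N_INDOLENT)

print(f"lung window: {lung:.2%}, difference region: {diff:.2%}")
```

Under this assumption both values agree with the reported accuracies of 81.33% and 73.33% to within rounding.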

Discussion
Using LDCT images and data from the NLST, we extracted radiomic features and calculated VDTs using a multi-window approach to identify features associated with tumor growth. Overall, radiomic features extracted from the combined windows yielded a highly predictive model for discriminating indolent from aggressive lung cancers, with an AUROC of 0.85 and an accuracy of 84.67%. The model derived from the combined window features had better performance statistics than the models derived from the lung window only or the difference region only. Combining the most predictive radiomic features and demographic risk factors into a radiomics nomogram demonstrated the translational potential for individualized estimation of tumor growth rate. As such, these data demonstrate that multi-window CT-based radiomic features are valuable for improving the personalization and precision of lung cancer screening and management. Now that LDCT imaging is approved for screening and early detection of lung cancer, the high reported rates of indolent cancers are a real-life concern. Bach [29] proposed a bipartite natural-history model of lung cancer, which classifies lung cancers as indolent or aggressive and treats them as separate entities. However, the definition of indolent lung cancer is not consistent across studies. In the NLST [7], indolent lung cancers were defined as the excess cancers in the LDCT arm compared with the standard chest radiography arm. In the Pittsburgh Lung Screening Study (PLuSS) [10], Thalanayar et al. combined volume (VDT ≥ 400 days) and PET (maximum standardized uptake value ≤ 1) information to define indolence and estimated a prevalence of 18.5%. Yankelevitz et al. [9] calculated the VDT (VDT ≥ 400 days) from the size measurements recorded in the Mayo Lung Project (MLP) and the Memorial Sloan Kettering Cancer Center (MSK) trial to evaluate indolent cases in chest radiography screening and identified indolence in 2 to 7% of cases.
Using a similar definition, Lindell et al. [6] retrospectively evaluated indolence over 5 years of LDCT screening and reported a rate of 25%. In the Continuous Observation of Smoking Subjects (COSMOS) study [8], Veronesi et al. used volume-based VDT (VDT ≥ 400 days or 600 days) to define slow-growing or indolent lung cancer and suggested that cancers with a VDT of 400 days or more could be overdiagnosed. Compared with the VDT from 2-dimensional analysis, the VDT from 3-dimensional analysis has good reproducibility [30]. Volume changes estimated from the 2-dimensional diameter may miss asymmetric growth [31]. Moreover, VDT is significantly associated with lung cancer risk and lung cancer-specific mortality [8,32], and assessment of VDT has been valuable in reducing false positives [33]. Thus, VDT is a reliable and direct indicator of cancer aggressiveness. In our study, using VDT from volumetric analysis as the criterion, about 26% of lesions were classified as indolent lung cancer, with a median VDT of 583 days, which is similar to previous reports [6][7][8]. Recognizing lung cancers with different growth patterns would help in defining the follow-up interval, reducing the cost of screening and the overtreatment of indolent lesions while avoiding delay of the best treatment opportunity for aggressive lung cancer.
In our analysis we found that 47% of the nodules exhibited an inconsistent growth pattern between the two time periods (i.e., baseline to first follow-up versus first follow-up to second follow-up), and 2 lesions became smaller in volume at some time point. Similar findings have been reported in previous studies [6,34]. In a five-year lung cancer screening study, Lindell et al. [6] reviewed the growth curves of 18 lung cancers with at least four CT scans and found that growth patterns varied when lesions were stratified by CT attenuation, survival and size. They also found that 4 tumors decreased in size during follow-up, including two bronchioloalveolar carcinomas and two non-bronchioloalveolar carcinomas. Similarly, Leo [34] reported a rare regression of lung cancer without any intervention. Classically, lung cancer was assumed to follow an exponential growth model, but there is increasing evidence that the natural history of malignant lung nodules does not always fit this model. The complex interactions among tumor stem cells, the tumor microenvironment and the immune system play an important role in tumor progression [35]. Our findings suggest that evaluating a lung cancer at a single time point may not always predict tumor growth and may even mislead lung nodule management. As such, non-invasive imaging-based predictors of tumor growth at different time points, as presented in our analysis, should help to identify different growth patterns of lung cancer and to select a personalized follow-up interval during lung cancer screening. Although radiomic features have been used in lung cancer risk prediction and diagnosis [14][15][16], our current analysis is the first to evaluate the growth pattern of lung cancers using multi-window CT radiomic features.
With the large number of objective quantitative metrics extracted either from the entire tumor or from a particular area of interest within a tumor, radiomics depicts intratumoral heterogeneity, which subjective radiologic descriptors are inadequate to capture, and is used to evaluate and monitor tumor evolution over time. However, most current quantitative metrics lack spatial information, especially for lung LDCT scans, and most radiomic analyses of lung nodules are based on single lung window CT images. The spatially explicit analysis of tumor regions is an emerging focus of cancer imaging [36]. In the present study, we proposed the CT window as a practical and objective way to define lung tumor habitats spatially, and we extracted radiomic features separately from the lung window, the mediastinal window and the difference region between these two window settings. Although the most informative features for distinguishing indolent from aggressive lung cancer came from both the lung window and the difference region (data not shown), the multi-window-based difference region model had the better performance statistics (Table 2). Moreover, compared with the single lung window model, the combined predictive model based on multi-window CT images had statistically better performance, with an AUROC of 0.85. Different CT window settings may play different roles in describing lung cancer physiology; however, the relationship between quantitative imaging and pathology remains poorly understood. Some studies found that the solid portion of lung cancer in the mediastinal window was associated with adenocarcinoma invasiveness and that using a mediastinal window setting criterion could improve interobserver agreement in classifying subsolid lung nodules [18,19,37]. Okada et al. [20] found that the ratio of the tumor area in the mediastinal window to that in the lung window was prognostic.
The 5-year survival was 48% in cases with a ratio of 0 to 25%, 87% with a ratio of 26 to 50%, 97% with a ratio of 51 to 75%, and 100% with a ratio of 76 to 100%. Moreover, a higher disappearance ratio between the two CT window settings was also related to less lymphatic invasion, vascular invasion and nodal involvement. Thus, the difference region between the lung window and the mediastinal window has the potential to identify the clinical-pathologic characteristics and aggressiveness of lung cancer. Our results support this conclusion. The mechanistic explanation for this observation is not known; however, it could be attributed to the fact that most of the discrepancy region between the two CT window settings is located at the periphery of the tumor, where active tumor stem cells interact with the surrounding microenvironment. Future work is needed to elucidate these findings; taken together, these results provide further clues for exploring the role of window-based radiomic features in personalized and precision medicine.
We also found that sex and history of COPD differed significantly between indolent and aggressive lung cancers and that including this information in the radiomics nomogram (shown in Fig. 4) improved prediction. The sex-based difference in growth rate in our results is consistent with previous studies. Hasegawa et al. [38] found that the mean VDT of lung tumors was longer in women (559 days for women and 387 days for men). Lindell et al. [6] reported a greater difference between the sexes (688 days for women and 234 days for men) and suggested that the higher incidence of slow-growing or indolent lung cancer in women was related to histologic type. The link between COPD and lung cancer has garnered substantial attention over the past decade, and many epidemiological studies have consistently demonstrated an increased incidence of lung cancer in patients with a history of COPD [39][40]. Little is known about the association between COPD and tumor growth; our analysis revealed that COPD was less frequent among indolent lung cancers than among aggressive lung cancers. This finding supports the COSMOS study [8], which indicated that slow-growing or indolent lung cancer was more common in low-risk persons.
We acknowledge some limitations of this analysis. First, the sample size was modest because of the strict inclusion criteria. Second, we did not stratify the lung nodules according to attenuation, because the discrepancy between the two CT window settings already captures density information. Third, the NLST participants were enrolled at different U.S. medical centers and the CT scanning parameters were not consistent; however, this heterogeneity may be an advantage, because features extracted under varied conditions are more likely to generalize to other screening or incidentally detected lung cancer cohorts. Finally, although we used bootstrap resampling for internal validation of our final backward-elimination models, validation in independent cohorts across institutions would help to confirm these findings.