Patients
This retrospective study was performed with the ethical approval of the institutional review board. From December 2012 to May 2018, we identified 275 consecutive patients with lung pGGNs (MIA, N = 167; IAC, N = 108) who underwent preoperative chest TSCT and had a single MIA or IAC pathologically confirmed after thoracic surgery. After screening, 100 cases with P-pGGN were finally included. The clinical characteristics of all P-pGGN cases (e.g., age, sex, and smoking history) were recorded.
Inclusion criteria: (A) patients having completed a lung TSCT scan 2 weeks before surgery; (B) pGGNs on lung window images (window width = 1500 HU, window level = −600 HU); (C) single MIA or IAC pathologically confirmed by thoracic surgery, with accompanying histopathological specimens; and (D) P-pGGNs located in the sub-pleural area on preoperative TSCT, defined as pGGNs attached or connected to the pleural surface, including the visceral and interlobar pleura. Exclusion criteria: (A) patients having undergone tumor therapy (radiotherapy, chemotherapy, etc.), puncture biopsy, or surgical resection before the TSCT scan; (B) images unavailable in the picture archiving and communication system; and (C) visible soft-tissue attenuation within the lesion on mediastinal window images (window width = 400 HU, window level = 40 HU).
CT scan acquisition
All patients underwent chest CT examinations without intravenous contrast-media injection on spiral CT scanners with > 16 detector rows (GE Healthcare; GEHD 750; Somatom Perspective). Patients were scanned in a supine position with their hands at either side of the head, from the upper supraclavicular area to the lower adrenal area, during a breath-hold at deep inspiration. Scanning was performed in conventional helical mode with a tube voltage of 120 kVp, tube current of 170–200 mA, slice thickness of 5.0 mm, slice interval of 5.0 mm, and matrix of 512 × 512. Images were reconstructed with a bone algorithm at a slice thickness of 1.0–1.5 mm and a slice interval of 1.0–1.5 mm.
Pathological analysis
All pathological evaluations were performed by examining hematoxylin and eosin (HE)-stained slides. These were prepared from formalin-fixed, paraffin-embedded tissues with 0.4 cm thick sections, including the largest section of the tumor. All tumors were histologically evaluated by two experienced pathologists blinded to patient pathological information, and the pathological type of each lesion was recorded. Pathological diagnoses of IAC and MIA were made according to the new ADC classification proposed by the IASLC/ATS/ERS in 2011. When opinions regarding morphology diverged, discrepancies were resolved by consensus.
Conventional image analysis
All raw CT images were retrieved from a picture archiving and communication system (PACS, DJ Health Union Systems Corporation) and reviewed in the lung window (window width = 1500 HU, window level = −600 HU) by two experienced chest radiologists blinded to histological findings. TSCT P-pGGN images were evaluated and the following imaging features were recorded.
Conventional morphological characteristics
(1) Tumor location: left superior lobe, left inferior lobe, right superior lobe, right middle lobe, or right inferior lobe; (2) Shape: round or oval, or irregular; (3) Tumor-lung interface: clear or not; (4) Lobulation sign: defined as a portion of the lesion edge that is wavy or fan-shaped; (5) Spiculate margin: defined as a thin line extending from the edge of the nodule into the lung parenchyma, but not reaching the pleural surface; (6) Bubblelike appearance: defined as 1–3 mm cystic transparency of air attenuation within the nodule; and (7) Air bronchogram: defined as lucency along the regular bronchial wall inside the P-pGGN.
Conventional quantitative CT features
(1) TLD on the largest axial plane (LAP): the LAP was selected from the axial TSCT images on the lung window, and the maximum diameter on the LAP was taken as the TLD; (2) Tumor short diameter (TSD) on the LAP: the diameter perpendicular to the TLD on the LAP was taken as the TSD; (3) Tumor vertical diameter (TVD) on the largest coronal plane (LCP) [24]: the LCP was selected from the coronal TSCT images on the lung window, and the largest diameter on the LCP was taken as the TVD; (4) CT value on the LAP (CT-LAP): the CT value measured on the LAP; (5) Relative CT value on the LAP (RCT-LAP) [19]: the normal lung density measured on the same plane as the LAP (NLD-LAP) was divided by the CT-LAP value to give the RCT-LAP.
Measurement standards for conventional quantitative CT features: (1) The region of interest (ROI) of the P-pGGN was delineated by a regular curve and should include > 70% of the lesion area. (2) ROI measurements should avoid large vessels, bronchi and vacuoles when bronchovascular bundles or vacuoles are present in the measurement layer; when these cannot be completely avoided, a sub-maximum layer should be selected. (3) Selection of NLD-LAP: the ROI was placed in the same lobe and subpleural area, at the same level as the lesion, and was drawn with an area and pulmonary microvascular attenuation similar to those of the ROI used for CT-LAP.
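As a minimal sketch of the ratio described in item (5) above, the calculation can be written as follows. The function name and the example Hounsfield unit readings are hypothetical and are not study data.

```python
def relative_ct_value(nld_lap_hu: float, ct_lap_hu: float) -> float:
    """Return RCT-LAP = NLD-LAP / CT-LAP, with both inputs in Hounsfield units."""
    if ct_lap_hu == 0:
        raise ValueError("a CT-LAP of 0 HU makes the ratio undefined")
    return nld_lap_hu / ct_lap_hu


# Hypothetical readings for one lesion (illustrative only).
rct_lap = relative_ct_value(nld_lap_hu=-850.0, ct_lap_hu=-680.0)
print(rct_lap)  # 1.25
```

Because both densities are normally negative in aerated lung, a denser nodule (CT-LAP closer to 0 HU) gives a larger ratio for the same background lung density.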
Radiomics analysis
Segmentation and radiomic feature extraction
All TSCT image layers were manually segmented, and radiomics features were extracted using the free open-source software 3D Slicer (version 4.8.1) (https://www.slicer.org/). A total of 106 radiomics features in seven categories were extracted automatically: shape (N = 13), Gray Level Dependence Matrix (GLDM; N = 14), Gray Level Co-occurrence Matrix (GLCM; N = 24), first order (N = 18), Gray Level Run Length Matrix (GLRLM; N = 16), Gray Level Size Zone Matrix (GLSZM; N = 16), and Neighboring Gray Tone Difference Matrix (NGTDM; N = 5).
Intra- and inter-observer agreements
Intra- and inter-observer agreements for feature extraction were evaluated using the intra- and inter-class correlation coefficients (ICCs). Initially, 50 P-pGGN TSCT images were randomly selected, and ROI segmentation and feature extraction were completed independently by two blinded experienced radiologists. To assess intra-observer agreement, observer one repeated feature extraction on the same CT images after an interval of no less than 30 days. Inter-observer agreement was assessed by comparing the features extracted by observer two with those extracted by observer one. The remaining images were segmented manually and independently by observer one.
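The agreement check can be illustrated with a short sketch. The text does not state which ICC form was used, so the code below assumes ICC(2,1) (two-way random effects, absolute agreement, single measurement). The function names, the feature names and the example values are hypothetical.

```python
from statistics import mean


def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per lesion and one column per reading."""
    n, k = len(ratings), len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]
    # Two-way ANOVA sums of squares: lesions (rows), readings (columns), error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def stable_features(readings_by_feature, threshold=0.75):
    """Keep the features whose ICC exceeds the threshold."""
    return [name for name, table in readings_by_feature.items()
            if icc_2_1(table) > threshold]


# Hypothetical values: each row is one lesion, columns are two readings.
example = {
    "feature_a": [[10.1, 10.3], [14.8, 15.0], [20.2, 19.9], [25.0, 25.4]],
    "feature_b": [[1.0, 3.5], [2.0, 0.5], [3.0, 3.2], [2.5, 1.0]],
}
print(stable_features(example))
```

The same function would serve for both comparisons: two readings by observer one for intra-observer agreement, or one reading from each observer for inter-observer agreement.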
Radiomics feature selection and radiomics signature development
The dataset was divided into training and testing samples in a 7:3 ratio. Cases in the training samples were used to select features and train the predictive model, while cases in the testing samples were used to independently evaluate the model’s performance. Before analysis, all features were standardized. ICCs were calculated to determine inter- and intra-observer agreement, and features with ICCs > 0.75 were retained. The least absolute shrinkage and selection operator (LASSO) method, which is suited to regression analysis of high-dimensional data, was then used to derive the most useful predictive features. Optimal features were selected according to the AUC. In LASSO, 10-fold cross-validation was performed to choose the optimal hyperparameter log(λ), with the mean square error as the criterion (the smaller the better). The radiomics signature (Rad-score) was calculated for each patient as the sum of the selected features weighted by their corresponding coefficients, and was used to predict IAC before surgery. A 10-fold cross-validation in the training samples was also performed to evaluate the performance and reliability of the model. A logistic regression model was built from the optimal feature subset of the training samples.
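The pipeline in this paragraph can be sketched in plain Python. This is an illustration under stated assumptions, not the glmnet procedure used in the study: the split is assumed to be a simple random shuffle, standardization is assumed to be a z-score fitted on the training samples, and LASSO is written as a least-squares coordinate descent for a single fixed λ (the 10-fold search over λ is omitted). All function names and data are hypothetical.

```python
import random
from statistics import mean, pstdev


def split_7_3(case_ids, seed=0):
    """Shuffle case IDs and return (train, test) in roughly a 7:3 ratio."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)
    cut = round(0.7 * len(ids))
    return ids[:cut], ids[cut:]


def fit_scaler(rows):
    """Per-feature mean and SD estimated from the training rows only."""
    cols = list(zip(*rows))
    return [mean(c) for c in cols], [pstdev(c) or 1.0 for c in cols]


def standardize(row, means, sds):
    return [(x - m) / s for x, m, s in zip(row, means, sds)]


def soft_threshold(v, lam):
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - b0 - X b||^2 + lam*||b||_1 by coordinate descent.

    Assumes the columns of X are already centred, so b0 is the mean of y.
    """
    n, p = len(X), len(X[0])
    b0, b = mean(y), [0.0] * p
    resid = [yi - b0 for yi in y]
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            norm = sum(x * x for x in col) / n
            if norm == 0.0:
                continue
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(x * (r + b[j] * x) for x, r in zip(col, resid)) / n
            new_bj = soft_threshold(rho, lam) / norm
            delta = new_bj - b[j]
            resid = [r - delta * x for x, r in zip(col, resid)]
            b[j] = new_bj
    return b0, b


def rad_score(z_row, coefs, intercept=0.0):
    """Sum of the standardized features weighted by their coefficients."""
    return intercept + sum(c * z for c, z in zip(coefs, z_row))


# Hypothetical toy data: 4 training cases, 2 features; 1 = IAC, 0 = MIA.
train_X = [[2.0, 5.0], [4.0, 5.5], [2.0, 4.5], [4.0, 5.0]]
train_y = [0.0, 1.0, 0.0, 1.0]
means, sds = fit_scaler(train_X)
Z = [standardize(r, means, sds) for r in train_X]
b0, coefs = lasso_cd(Z, train_y, lam=0.1)
print([round(rad_score(z, coefs, b0), 3) for z in Z])
```

Fitting the scaler on the training samples and reusing its means and SDs for the testing samples keeps information from the held-out cases out of feature selection. Features whose coefficient is shrunk to exactly zero drop out of the Rad-score.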
Evaluation of the radiomics signature
The AUC of the ROC curve was used to evaluate the predictive accuracy of the radiomics model, in both training and testing samples. A calibration curve was used to demonstrate the calibration degree, which reflected consistency between predicted and observed IAC risks. Decision curve analysis (DCA) was conducted to evaluate the clinical usefulness of the radiomics model (Fig. 1).
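Two of the quantities named above can be made concrete with a short sketch. It assumes that AUC is computed as the proportion of IAC/MIA pairs in which the IAC case receives the higher predicted value, and that DCA uses the standard net-benefit formula at a threshold probability. Function names and example values are hypothetical.

```python
def auc(scores, labels):
    """Probability that a random positive outranks a random negative (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(probs, labels, threshold):
    """Net benefit at threshold pt: TP/n - (FP/n) * pt / (1 - pt)."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Hypothetical predicted IAC probabilities; 1 = IAC, 0 = MIA.
probs = [0.9, 0.2, 0.8, 0.1]
labels = [1, 1, 0, 0]
print(auc(probs, labels))               # 0.75
print(net_benefit(probs, labels, 0.5))  # 0.0
```

A decision curve plots the net benefit over a range of thresholds and compares it with the "treat all" and "treat none" strategies. The model is considered clinically useful where its curve lies above both.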
Statistical analysis
The Kolmogorov-Smirnov test was used to evaluate whether variables were normally distributed. Variables were described as the mean ± standard deviation (SD) for normal distributions, and as the median and quartiles for non-normal distributions. T-tests were used for normally distributed variables, and the Mann–Whitney U test was used for non-normally distributed variables. A Pearson χ² test or Fisher exact test was used to test differences between groups in terms of tumor location, shape, tumor-lung interface, lobulation signs, spiculate margins, bubblelike appearance and air bronchograms. ROC curves were plotted to assess the performance of conventional quantitative CT features in differentiating the IAC and MIA groups. Accuracy, sensitivity, specificity and AUC were also calculated. LASSO was performed using the “glmnet” package. A calibration curve was plotted to evaluate the predictive accuracy of the radiomics signature. DCA was conducted to evaluate whether the radiomics signature was sufficiently robust for clinical practice. All statistical analyses were performed using SPSS (version 26.0, IBM, Armonk, NY, USA), R (version 3.5.1) and Python (version 3.5.6). A two-tailed P-value < 0.05 indicated statistical significance.
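For a single quantitative CT feature, accuracy, sensitivity and specificity follow from a 2 × 2 table at a chosen cutoff. The sketch below assumes that values at or above the cutoff are called IAC. How cutoffs were chosen is not stated in the text, so the cutoff, the function name and the measurements here are hypothetical.

```python
def diagnostic_metrics(values, labels, cutoff):
    """Accuracy, sensitivity, specificity when values >= cutoff are called IAC (1)."""
    tp = fp = tn = fn = 0
    for v, y in zip(values, labels):
        predicted_iac = v >= cutoff
        if predicted_iac and y == 1:
            tp += 1
        elif predicted_iac:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / len(labels),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical diameters in mm; 1 = IAC, 0 = MIA.
diameters = [12.0, 9.0, 15.0, 7.0, 11.0, 8.0]
labels = [1, 0, 1, 0, 0, 1]
print(diagnostic_metrics(diameters, labels, cutoff=10.0))
```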