A CT-based radiomics nomogram for differentiation of focal nodular hyperplasia from hepatocellular carcinoma in the non-cirrhotic liver

Background The purpose of this study was to develop and validate a radiomics nomogram for preoperatively differentiating focal nodular hyperplasia (FNH) from hepatocellular carcinoma (HCC) in the non-cirrhotic liver.
Methods A total of 156 patients with FNH (n = 55) and HCC (n = 101) were divided into a training set (n = 119) and a validation set (n = 37). Radiomics features were extracted from triphasic contrast CT images. A radiomics signature was constructed with the least absolute shrinkage and selection operator algorithm, and a radiomics score (Rad-score) was calculated. Clinical data and CT findings were assessed to build a clinical factors model. A radiomics nomogram combining the Rad-score and independent clinical factors was constructed by multivariate logistic regression analysis. Nomogram performance was assessed with respect to discrimination and clinical usefulness.
Results A total of 4227 features were extracted and reduced to the 10 most important discriminators to build the radiomics signature. The radiomics signature showed good discrimination in the training set (AUC [area under the curve], 0.964; 95% confidence interval [CI], 0.934–0.995) and the validation set (AUC, 0.865; 95% CI, 0.725–1.000). Age, hepatitis B virus infection, and enhancement pattern were the independent clinical factors. The radiomics nomogram, which incorporated the Rad-score and clinical factors, showed good discrimination in the training set (AUC, 0.979; 95% CI, 0.959–0.998) and the validation set (AUC, 0.917; 95% CI, 0.800–1.000), and showed better discrimination capability (P < 0.001) than the clinical factors model (AUC, 0.799; 95% CI, 0.719–0.879) in the training set. Decision curve analysis showed that the nomogram outperformed the clinical factors model in terms of clinical usefulness.
Conclusions The CT-based radiomics nomogram, a noninvasive preoperative prediction tool that incorporates the Rad-score and clinical factors, shows favorable predictive efficacy for differentiating FNH from HCC in the non-cirrhotic liver, which might facilitate the clinical decision-making process.


Background
Hepatocellular carcinoma (HCC) is the most common primary liver cancer and the third most common cause of cancer death worldwide [1,2]. Approximately 80% of HCC cases occur in patients with liver cirrhosis arising from hepatitis B or C infection or alcoholism [2,3]. In patients with liver cirrhosis, a noninvasive diagnosis of HCC can be established from the characteristic combination of arterial phase hyperenhancement followed by portal venous or delayed phase washout on multiphasic contrast CT or MRI. However, an increasing number of HCCs arise in the non-cirrhotic liver [3], probably because of transient hepatitis B infection or diffuse liver damage caused by non-alcoholic fatty liver disease. In such non-cirrhotic cases, benign hypervascular liver lesions (hepatocellular adenoma [HCA] and focal nodular hyperplasia [FNH]) should be included in the differential diagnosis.
FNH is the second most common benign liver tumour in the non-cirrhotic liver, characterized by nodular hyperplasia of the hepatic parenchyma around a central stellate area of fibrosis associated with a congenital vascular malformation [4-7]. Typical FNH can be diagnosed with confidence by using multiphasic contrast CT or MRI. Atypical FNH may show less intense enhancement, absence of a central scar, pseudocapsular enhancement on delayed images, as well as the presence of hemorrhage, calcification, or necrosis [8,9], making the differential diagnosis between atypical FNH and HCC rather difficult. The distinction between HCC and FNH is critical as the management differs considerably.
Various imaging modalities have been applied to distinguish HCC from FNH, such as CT [1,9,10], Doppler ultrasound [11,12] and MRI [1,3,5,13-15]. Gadoxetic acid-enhanced MRI is increasingly used to differentiate focal liver lesions. HCC generally shows definite hypointensity on the hepatobiliary phase (HBP) because of decreased or absent uptake of gadoxetic acid. In contrast, FNH commonly shows iso- or hyperintensity on the HBP because of its preserved ability to take up gadoxetic acid. However, 10-15% of HCCs show iso- or hyperintensity on the HBP because of overexpression of organic anion-transporting polypeptide (OATP) 1B3, one of the transporters that take gadoxetic acid up into hepatocytes [1]. Approximately 10-12% of FNHs may not show iso- or hyperintensity on the HBP [7]. This paradoxical uptake, or lack of uptake, can make the differentiation of HCC from FNH rather difficult.
The purpose of this study was to construct and validate a CT-based radiomics nomogram that incorporates a radiomics signature and clinical factors for the preoperative differentiation of HCC from FNH in the non-cirrhotic liver.

Patients
The institutional review board of our hospital approved this retrospective study and waived the requirement for informed consent.
Patients were identified by searching the pathology database of one institution (The Affiliated Hospital of Qingdao University) between June 2008 and February 2019 for a diagnosis of FNH or HCC on surgically resected specimens. A total of 156 patients with FNH (n = 55, 32 men and 23 women; mean age, 31.82 ± 12.55 years) or HCC (n = 101, 85 men and 16 women; mean age, 57.10 ± 9.89 years) were enrolled according to the following inclusion criteria: (1) pathologically proven FNH or HCC; (2) contrast-enhanced CT performed less than 15 days before surgery; (3) complete clinical and pathologic data. The exclusion criteria were as follows: (1) HCC with CT features of liver cirrhosis (a cirrhotic liver may demonstrate a nodular surface, widened fissures between lobes, an atrophied right lobe, hypertrophy of the left lobe and/or caudate lobe, and other features including portal vein dilation, portosystemic shunts, splenomegaly, and ascites); (2) HCC treated with chemotherapy or radiotherapy before surgery; (3) image quality unsatisfactory for analysis. The patients were divided into two independent sets: 119 patients treated between June 2008 and January 2017 constituted the training set, whereas 37 patients treated between February 2017 and February 2019 constituted the validation set.
Clinical information, including age, gender, hepatitis B and C virus (HBV and HCV) infection and serum alpha fetoprotein (AFP) level (> 400 ng/mL; ≤ 400 ng/mL), was derived from medical records.

CT image acquisition
CT scans were obtained with two 64-slice CT scanners (Somatom Sensation 64, Siemens Healthcare, Erlangen, Germany; Discovery 750, GE Healthcare, Milwaukee, USA) using the following parameters: 120 kVp tube voltage, 200 mAs or 250-400 mA (using automatic tube current modulation) tube current, 64 × 0.6 mm or 64 × 0.625 mm detector collimation, a matrix of 512 × 512, a pitch of 1 or 1.375, a gantry rotation time of 0.5 s and a slice thickness of 5 mm. The scanning area covered the entire liver. An 80-90 mL volume of nonionic contrast agent (Iopromide, Ultravist 370; Bayer, Germany) was administered into the antecubital vein by a power injector at a rate of 2.5 mL/s. Pre-contrast CT was first obtained, followed by three post-contrast CT scans of the liver obtained in arterial phase (AP, 30 s), portal venous phase (PVP, 60 s), and delayed phase (DP, 90-120 s).

CT features analysis
The CT images were analyzed in our Picture Archiving and Communication System (PACS, Version 3.2.8, GE Healthcare, Milwaukee, USA) by two radiologists (Reader 1, P.N; Reader 2, G.Y) with 8 and 6 years of abdominal imaging experience, respectively. Blinded to the clinicopathologic data, the two readers interpreted the following subjective CT features by consensus: the diameter of the tumour on the axial CT image; shape (round or not round); a central scar (present or absent; defined as a central stellate structure showing low attenuation on unenhanced CT images, hypovascular enhancement on AP and PVP, and delayed enhancement on DP); degeneration (present or absent; defined as a non-enhancing area on the dynamic study due to necrosis or hemorrhage, where low attenuation on unenhanced CT images was assumed to correspond to necrosis and high attenuation to hemorrhage); fat deposition (present or absent; defined as an area showing fat attenuation on unenhanced CT images); calcification (present or absent); a capsule-like rim (present or absent; defined as a tumour rim showing low attenuation on unenhanced CT images and hypovascular-delayed enhancement on dynamic studies); dysmorphic vessels (present or absent; defined as prominent or enlarged vessels in or around the tumour); and enhancement pattern. The enhancement pattern on dynamic CT was classified as early enhancement with a washout pattern, early enhancement with no washout pattern, or other patterns. Early enhancement was defined as higher attenuation than the background liver on AP. Washout was defined as lower attenuation than the background liver on PVP to DP. No washout indicated that the nodule showed attenuation equivalent to or higher than that of the background liver on PVP to DP. Other patterns referred to enhancement patterns not mentioned above.
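The three-way enhancement classification above can be illustrated with a short sketch. In the study the readers assessed enhancement visually and by consensus; the version below is only a numerical rendering of the stated definitions, assuming hypothetical per-phase attenuation measurements in Hounsfield units for the lesion and the background liver. The function name, the `tolerance_hu` margin, and the choice to count washout when the lesion is lower than liver on either PVP or DP are assumptions, not part of the original method.

```python
from enum import Enum


class EnhancementPattern(Enum):
    # Numbering follows the coding used in the nomogram figure (1, 2, 3).
    EARLY_WITH_WASHOUT = 1
    EARLY_NO_WASHOUT = 2
    OTHER = 3


def classify_enhancement(lesion_hu, liver_hu, tolerance_hu=0.0):
    """Classify a lesion from per-phase attenuation values (HU).

    lesion_hu and liver_hu are dicts keyed by "AP", "PVP" and "DP".
    tolerance_hu is a hypothetical margin for "equivalent" attenuation;
    it is not a value taken from the study.
    """
    # Early enhancement: lesion brighter than background liver on AP.
    early = lesion_hu["AP"] - liver_hu["AP"] > tolerance_hu
    if not early:
        return EnhancementPattern.OTHER
    # Washout (assumption): lesion darker than liver on PVP or on DP.
    washout = any(
        liver_hu[phase] - lesion_hu[phase] > tolerance_hu
        for phase in ("PVP", "DP")
    )
    if washout:
        return EnhancementPattern.EARLY_WITH_WASHOUT
    return EnhancementPattern.EARLY_NO_WASHOUT
```

A lesion measuring 120 HU against 70 HU liver on AP, then 90 HU against 110 HU liver on PVP, would fall into pattern 1 under these assumptions.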

Construction of the clinical factors model
Univariate analysis was applied to compare the differences of the clinical factors (including clinical information and CT features) between the two groups, and a multiple logistic regression analysis was used to build the clinical factors model using the significant variables from the univariate analysis as inputs. Odds ratios (OR) as estimates of relative risk with 95% confidence intervals (CI) were obtained for each risk factor.
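As an illustration of how an odds ratio and its 95% CI follow from a fitted logistic regression coefficient, the sketch below applies the usual Wald interval. The text does not state which interval method was used, so the Wald form is an assumption, and the function name and example numbers are hypothetical.

```python
import math


def odds_ratio_with_ci(beta, standard_error, z=1.959964):
    """Return (OR, lower, upper) for one logistic regression coefficient.

    Uses the Wald interval exp(beta +/- z * SE); z defaults to the
    two-sided 95% normal quantile.
    """
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z * standard_error)
    upper = math.exp(beta + z * standard_error)
    return odds_ratio, lower, upper


# Hypothetical coefficient and standard error, for illustration only.
example_or, example_lower, example_upper = odds_ratio_with_ci(0.8, 0.3)
```

An interval that excludes 1 corresponds to a coefficient that differs significantly from zero at the chosen level.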

Tumour segmentation and radiomics feature extraction
Tumor regions of interest (ROIs) were manually segmented on the largest cross-sectional area using ITK-SNAP software (Version 3.8.0). Contours were drawn slightly within the borders of the tumours on AP, PVP, and DP images, avoiding the adjacent hepatic parenchyma and perinephric fat.
Feature extraction was performed using the Radcloud platform (Huiying Medical Technology Co., Ltd). A total of 4227 radiomics features were extracted from the ROIs. The radiomics features were divided into four groups: (1) intensity statistics features, 19 features that quantitatively describe the distribution of voxel intensities within the ROI through commonly used basic metrics; (2) shape features, 10 two-dimensional features that reflect the shape and size of the ROI; (3) texture features, 59 features calculated from the gray level co-occurrence matrix (GLCM), gray level run length matrix (GLRLM) and gray level size zone matrix (GLSZM) that quantify the heterogeneity of the ROI; and (4) filter and wavelet features, the intensity and texture features derived from filter and wavelet transformations of the original image, obtained by applying filters such as exponential, logarithm, square, square root and wavelet (wavelet-LHL, wavelet-LHH, wavelet-HLL, wavelet-LLH, wavelet-HLH, wavelet-HHH, wavelet-HHL, and wavelet-LLL).
Inter- and intra-class correlation coefficients (ICCs) were used to evaluate the inter-observer reliability and intra-observer reproducibility of feature extraction. We randomly chose 30 cases (10 FNHs and 20 HCCs), and ROI segmentation was performed independently by Reader 1 and Reader 2. Reader 1 then repeated the same procedure 1 week later to evaluate the agreement of feature extraction. An ICC greater than 0.75 was taken to indicate good agreement. The remaining image segmentation was conducted by Reader 1.
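The agreement check described above can be sketched as follows. The text does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), is an assumption here; the function names are hypothetical. Each row holds one lesion and each column one reading (two readers for inter-observer agreement, or two sessions of the same reader for intra-observer agreement).

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rating.

    ratings is a list of rows, one row per lesion, one column per reading.
    """
    n = len(ratings)          # number of lesions
    k = len(ratings[0])       # number of readings per lesion
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def is_reproducible(ratings, threshold=0.75):
    """Apply the 0.75 cut-off stated in the text to one feature."""
    return icc_2_1(ratings) > threshold
```

A feature would be retained only if both its inter-observer and its intra-observer ICC pass the threshold.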

Construction of the radiomics signature
Radiomics features that had inter- and intra-observer ICCs greater than 0.75 and differed significantly between the two groups on one-way analysis of variance (ANOVA) were entered into the least absolute shrinkage and selection operator (LASSO) regression model to select the most valuable features in the training set. The selected features were then combined into a radiomics signature. A radiomics score (Rad-score) was calculated for each patient as a linear combination of the selected features weighted by their respective LASSO coefficients.
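The Rad-score calculation described above is a weighted sum, which the sketch below makes explicit. The feature names, coefficient values and intercept are hypothetical placeholders, not the values fitted in this study; in practice the coefficients would come from the LASSO model fitted on the training set.

```python
def rad_score(features, coefficients, intercept=0.0):
    """Linear combination of selected features weighted by their coefficients.

    features: dict mapping feature name to its value for one patient.
    coefficients: dict mapping each selected feature name to its weight.
    Features not present in coefficients are ignored, which mirrors LASSO
    shrinking the weights of unselected features to zero.
    """
    missing = [name for name in coefficients if name not in features]
    if missing:
        raise KeyError(f"missing feature values: {missing}")
    return intercept + sum(
        weight * features[name] for name, weight in coefficients.items()
    )


# Hypothetical example: two selected features out of three extracted.
example_features = {"feature_a": 2.0, "feature_b": -1.0, "feature_c": 5.0}
example_coefficients = {"feature_a": 0.5, "feature_b": 2.0}
example_score = rad_score(example_features, example_coefficients, intercept=0.1)
```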

Development of a radiomics nomogram and assessment of the performance of different models
A radiomics nomogram was developed by incorporating the significant clinical factors as well as the Rad-score. The diagnostic performance of the clinical factors model, the radiomics signature and the radiomics nomogram for differentiating FNH from HCC was assessed by using the area under the receiver operating characteristic (ROC) curve (AUC) in both the training and validation sets. A radiomics nomogram-defined score (Nomo-score) was calculated for each patient in the training and validation sets. To estimate the clinical utility of the nomogram, decision curve analysis (DCA) was performed by calculating the net benefits for a range of threshold probabilities in the training set.
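The net benefit underlying decision curve analysis can be sketched with the standard definition, net benefit = TP/n - (FP/n) × pt/(1 - pt), where pt is the threshold probability. The code below is a minimal illustration of that formula with hypothetical function names and toy data; it is not the routine used in the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted probability
    is at or above the threshold probability."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    true_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of the model."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data, for illustration only.
toy_labels = [1, 1, 0, 0]
toy_probs = [0.9, 0.6, 0.4, 0.1]
toy_curve = [(t, net_benefit(toy_labels, toy_probs, t)) for t in (0.2, 0.5, 0.8)]
```

Plotting net benefit against the threshold for each model, together with the treat-all and treat-none (net benefit of zero) references, gives the decision curve.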

Statistics
Statistical analysis was performed using SPSS (Version 25.0, IBM) and R statistical software (Version 3.3.3, https://www.r-project.org). In the univariate analysis, clinical factors were compared between the two groups using the chi-square test or Fisher exact test for categorical variables and the Mann-Whitney U test for continuous variables, where appropriate. One-way ANOVA was used to compare the values of radiomics features between FNH and HCC. The "glmnet" package was used to perform the LASSO regression analysis. The ROC curves were plotted using the "pROC" package. Nomogram construction was performed using the "rms" package. Differences in AUC values between models were analyzed using the DeLong test. DCA was performed using the "dca.R" package. P < 0.05 was considered statistically significant.
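For readers less familiar with the AUC, it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, which is the normalized Mann-Whitney U statistic. The sketch below computes it directly from that definition; it is an illustration with hypothetical names and toy data, not the "pROC" implementation, and it does not perform the DeLong comparison.

```python
def auc_mann_whitney(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly,
    counting ties as half a correct ranking."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data, for illustration only.
toy_auc = auc_mann_whitney([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise loop is quadratic in the number of cases, which is acceptable for cohorts of this size.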

Clinical factors of the patients and the construction of the clinical factors model
The clinical factors of the patients in the training and validation sets are shown in Table 1. In the training set, there were significant differences in age, gender, HBV infection, AFP level, central scar, degeneration, capsule-like rim and enhancement pattern between the two groups (P < 0.05), whereas diameter, shape, fat deposition, calcification, and dysmorphic vessels were not significantly different (P > 0.05). The multiple logistic regression analysis showed that only age (P < 0.001), HBV infection (P = 0.001), and enhancement pattern (P = 0.019) remained as independent predictors in the clinical factors model.

Feature extraction, selection and radiomics signature construction
Of the 4227 radiomics features extracted from AP, PVP and DP CT images, 3441 showed good inter- and intra-observer agreement, with ICCs from 0.750 to 1.000. The 764 radiomics features that differed significantly between FNH and HCC (P = 0.001-0.050) were entered into the LASSO logistic regression model to select the most valuable features (Fig. 1). Finally, the radiomics signature was built by using 10 features. The Rad-score was calculated using the following formula: Rad-score = 0.0522 × GrayLevelNonUniformityNormalized …

The radiomics nomogram building and assessment of the performance of different models
Age, HBV infection, enhancement pattern, and the Rad-score were incorporated into the radiomics nomogram (Fig. 2). The diagnostic performance of the clinical factors model, the radiomics signature and the radiomics nomogram is summarized in Table 2. ROC curves of the three models are shown in Fig. 3. In the training set, the AUCs of the radiomics nomogram and the radiomics signature were significantly higher than that of the clinical factors model (both P < 0.001); however, no significant difference in AUC was found between the radiomics nomogram and the radiomics signature (P = 0.253). In the validation set, there were no significant differences in AUC among the three models (the clinical factors model vs. the radiomics signature, P = 0.376; the clinical factors model vs. the radiomics nomogram, P = 0.055; the radiomics signature vs. the radiomics nomogram, P = 0.345). The Nomo-scores for each patient in the training and validation sets are shown in Fig. 4. The DCA (Fig. 5) showed that the radiomics nomogram had a higher overall net benefit in differentiating FNH from HCC than the clinical factors model across the majority of the range of reasonable threshold probabilities.
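A nomogram of this kind is a graphical form of a logistic regression model, so the predicted probability can be sketched as a logistic function of age, HBV status, enhancement pattern and Rad-score. Everything numeric below is a hypothetical placeholder chosen for illustration, not the fitted model; the choice of HCC as the positive class and of pattern 1 as the reference level are also assumptions.

```python
import math


def nomogram_probability(age, hbv_positive, enhancement_pattern, rad_score, coef):
    """Predicted probability of the positive class from a logistic model.

    enhancement_pattern is 1, 2 or 3; coef["pattern"] maps each level to
    its offset, with the reference level set to 0.0.
    """
    linear = (
        coef["intercept"]
        + coef["age"] * age
        + coef["hbv"] * (1.0 if hbv_positive else 0.0)
        + coef["pattern"][enhancement_pattern]
        + coef["rad_score"] * rad_score
    )
    # Numerically stable logistic function.
    if linear >= 0:
        return 1.0 / (1.0 + math.exp(-linear))
    z = math.exp(linear)
    return z / (1.0 + z)


# Hypothetical coefficients for illustration only; not the fitted values.
EXAMPLE_COEF = {
    "intercept": -4.0,
    "age": 0.08,
    "hbv": 1.5,
    "pattern": {1: 0.0, 2: -1.0, 3: -0.5},
    "rad_score": 2.0,
}

example_probability = nomogram_probability(60, True, 1, 0.5, EXAMPLE_COEF)
```

With real fitted coefficients, the linear predictor corresponds to the total points read off the nomogram axes.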

Discussion
The present study shows that the contrast-enhanced CT-based radiomics nomogram, which incorporates the radiomics signature and clinical factors, has favorable predictive value for differentiating HCC from FNH in the non-cirrhotic liver, with AUCs of 0.979 and 0.917 in the training and validation sets, respectively. Differentiating HCC from FNH is important for selecting appropriate treatment and avoiding unnecessary interventions. Sufficient clinical and imaging information facilitates the correct distinction of the two lesions. FNH occurs more frequently in young women (male:female ratio = 1:8) [4,7,45]. HCC is often associated with hepatitis virus infection and a higher level of AFP. Five clinical variables (age, gender, HBV infection, HCV infection and serum AFP level) were analyzed in this study. We found that FNH patients were significantly younger and more often female than HCC patients, whereas HBV infection and a higher AFP level were more common in the HCC group. Age and HBV infection were confirmed as independent predictors by the multiple logistic regression analysis, which is consistent with previous studies.
Contrast-enhanced CT (CECT) is the first-line imaging modality for the characterization of liver lesions. However, distinguishing FNH from HCC on CECT remains a challenge, especially when the lesions lack typical imaging characteristics such as a central scar, suggestive of FNH (reported in about 65% of FNHs larger than 3 cm [6]), or liver cirrhosis, suggestive of HCC. HCC shares overlapping imaging features with FNH in the non-cirrhotic liver. The classic radiological hallmark of HCC is hyperenhancement on AP followed by washout on PVP or DP. FNH may also present as a hypervascular lesion with intense enhancement and washout on PVP and DP. Therefore, CECT has limited diagnostic value in the non-cirrhotic liver for distinguishing HCC from atypical FNH.
Various strategies have been proposed to differentiate benign from malignant liver tumours with conventional CT and MR imaging characteristics. Yu et al. [9] enrolled 42 HCCs and 16 FNHs to assess the value of CT spectral imaging in differentiating HCC from FNH during the AP and PVP, and found that the lesion-normal parenchyma iodine concentration ratio in AP had the highest sensitivity (100%) and specificity (100%) in differentiating HCC from FNH. Boas et al. [10] developed and validated a simplified triphasic CT-based model of tumor blood supply that combined hepatic artery and portal vein blood supply coefficients to distinguish benign (n = 32) and malignant (n = 46) liver lesions. In addition to traditional relative enhancement criteria (such as washout), hepatic artery and portal vein blood supply coefficients could be used to classify hypervascular liver lesions, achieving a specificity of 97% and a sensitivity of 76% for malignancy. Fischer et al. [3] included 55 HCCs, 28 FNHs, and 24 HCAs to identify the key MRI features that can potentially be used to differentiate between HCC and benign hepatocellular tumors in the non-cirrhotic liver. Multivariate analysis revealed that T1-hypointensity, T2-hypo- or hyperintensity, lack of central tumor enhancement, presence of satellite lesions, and lack of liver-specific contrast media uptake were independent MRI features indicating HCC. Kitao et al. [1] sought imaging findings useful for differentiating HCCs that show hyperintensity on the HBP of gadoxetic acid-enhanced MRI from FNH and FNH-like nodules. The CT and MRI features of 51 HCCs, 10 FNHs, and 16 FNH-like nodules were analyzed. Multivariate logistic regression analysis showed that arterial phase enhancement and a washout pattern at dynamic CT and a decreased apparent diffusion coefficient (ADC) ratio were important findings for distinguishing hyperintense HCC from FNH and FNH-like nodules.
In the present study, a clinical factors model was developed by combining clinical data with subjective CT features in a multivariate logistic regression analysis, and age, HBV infection, and enhancement pattern were found to be significant predictors for differential diagnosis. With this clinical factors model, AUCs of 0.799 in the training set and 0.769 in the validation set were achieved for differentiating HCC from FNH.
Fig. 2 The radiomics nomogram, combining age, HBV infection, enhancement pattern, and radiomics score (Rad-score), developed in the training set. Enhancement patterns 1, 2, and 3 represent early enhancement with a washout pattern, early enhancement with no washout pattern, and other enhancement patterns, respectively.
Radiomics enables the noninvasive profiling of tumor heterogeneity by extracting a high throughput of quantitative descriptors from routinely acquired CT and MRI studies. Previous investigations have shown that CT/MRI-based radiomics can be used to differentiate several hypervascular liver tumours. Raman et al. [41] demonstrated that CT texture analysis could distinguish different hypervascular liver lesions using a random-forest model. Seventeen FNHs, 19 HCAs, 25 HCCs, and 19 cases of normal liver parenchyma were analyzed, and the texture model distinguished the three lesion types and normal liver with predicted classification accuracies of 91.2% for HCA, 94.4% for FNH, and 98.6% for HCC. Wu et al. [37] developed and validated an MRI-based radiomics signature to distinguish HCC from hepatic hemangioma (HH) using four feature classifiers, and found that the logistic regression classifier showed better predictive ability, achieving an AUC of 0.89 for differentiating HCC from HH. Stocker et al. [38] enrolled 55 cases of HCC and 45 cases of benign hepatocellular tumours (including FNHs) in the non-cirrhotic liver to assess the accuracy of MRI texture features in differentiating benign from malignant liver tumours. One gray-level histogram feature (skewness) and four run-length matrix features extracted from AP images were identified as significant texture predictors that help distinguish HCC from benign hepatocellular tumors in the non-cirrhotic liver, with an accuracy of 84.5% and an AUC of 0.92. Cannella et al. [43] investigated the texture and subjective MRI features of 32 FNHs and 51 HCAs and found that MRI texture analysis (TA) parameters combined with hypointensity on HBP imaging yielded an AUC of 0.979 and an accuracy of 96.4% for the diagnosis of HCA. A similar CT-based texture model was built by Cannella et al. [42] for the distinction of HCA and FNH. The mean, mpp, and entropy of medium-level and coarse-level filtered images on AP were found to be independent predictors for the diagnosis of HCA, and the model based on all these parameters resulted in the largest AUC of 0.824.
In this study, a radiomics nomogram was constructed by combining age, HBV infection, enhancement pattern, and Rad-score. The multiple logistic regression analysis showed that the Rad-score made a major contribution to differential diagnosis. Among the independent clinical predictors, age carried much more weight than enhancement pattern. This result is consistent with previous studies showing that HCC occurs more frequently in older patients than FNH does [4,7,45]. Although their enhancement patterns differ significantly, the two entities share overlapping enhancement features [6], so the enhancement pattern has a limited impact on the distinction between HCC and FNH in the non-cirrhotic liver.
Fig. 4 The radiomics nomogram-defined scores for each patient in the training and validation sets. Orange bars represent the scores for HCC patients, while green bars represent the scores for FNH patients.
Compared with the radiomics investigations above on discriminating different hepatic tumours, our study had several improvements. First, we focused on the distinction of FNH and HCC in the non-cirrhotic liver, because these tumours are the most difficult to differentiate in routine clinical practice and are often the cause of diagnostic errors. Second, previous studies were mainly based on texture analysis with only dozens of texture features, whereas radiomics now offers many more statistical features that provide a more comprehensive description of the tumour. In this study, a total of 4227 radiomics features were extracted from the triphasic CT images and analyzed, and 10 features were finally selected as the significant predictors to construct the radiomics signature. All the selected features were high-order filter and wavelet features that could not be acquired by conventional texture analysis. Furthermore, AP, PVP, and DP CT images were all used for feature selection, and 5 of the 10 selected features were obtained from DP images, indicating a trend toward better classification of FNH and HCC with DP images. In addition, FNH is not associated with any malignant potential, and most lesions are managed conservatively, so FNH confirmed by surgical pathology accounts for only a small portion of cases. Even so, relatively more cases of FNH were enrolled in the present study than in previous studies.
We acknowledge the following limitations in our study. First, because of its retrospective design, potential selection bias may hamper the reproducibility and comparability of the results. Thus, the clinical usefulness of this nomogram still needs improvement and independent validation in further studies. Second, this was a single-center study limited to our institute; multi-center studies with a larger sample are required to further validate its reproducibility. Third, radiomics features were extracted from two-dimensional ROIs delineated on the largest tumour cross-section, whereas whole-tumour analysis appears more indicative of tumour heterogeneity than the largest cross-sectional area [46]. In addition, manual ROI segmentation is time-consuming and complicated, especially for tumours without a well-defined boundary, so an automatic segmentation technique with favorable reliability and reproducibility is needed. Fourth, slice thickness has been reported to affect the diagnostic performance of a radiomics signature, and thin slices may be more informative [47]. A slice thickness of 5 mm was used in this study, which is relatively thick for CT radiomics analysis. The difference in radiomics performance between thin-slice and thick-slice images will be assessed in our future study.

Conclusions
In conclusion, the CT-based radiomics nomogram developed and validated for preoperative differentiation of FNH from HCC in the non-cirrhotic liver can potentially supplement conventional imaging modalities. However, the clinical use of this tool remains to be tested.