Outcomes assessment in intrahepatic cholangiocarcinoma using qualitative and quantitative imaging features

Background: To assess the performance of imaging features, including radiomics texture features, in predicting histopathologic tumor grade, AJCC stage, and outcomes [time to recurrence (TTR) and overall survival (OS)] in patients with intrahepatic cholangiocarcinoma (ICC). Methods: Seventy-three patients (26 M/47 F, mean age 63 y) with pre-operative imaging (CT, n = 37; MRI, n = 21; CT and MRI, n = 15) within 6 months of resection were included in this retrospective study. Qualitative imaging traits were assessed by 2 observers. A 3rd observer measured tumor apparent diffusion coefficient (ADC), enhancement ratios (ERs), and Haralick texture features. Blood biomarkers and imaging features were compared with histopathology (tumor grade and AJCC stage) and outcomes (TTR and OS) using log-rank, generalized Wilcoxon, Cox proportional hazards regression, and Fisher exact tests. Results: Median TTR and OS were 53.9 and 79.7 months. ICC recurred in 64.4% (47/73) of patients and 46.6% (34/73) of patients died. There was fair accuracy for some qualitative imaging features in the prediction of worse tumor grade (maximal AUC of 0.68 for biliary obstruction on MRI, p = 0.032, observer 1) and higher AJCC stage (maximal AUC of 0.73 for biliary obstruction on CT, p = 0.002, observer 2; and AUC of 0.73 for vascular involvement on MRI, p = 0.01, observer 2). Cox proportional hazards regression analysis showed that CA 19–9 [hazard ratio (HR) 2.44, 95% confidence interval (CI) 1.31–4.57, p = 0.005] and tumor size on imaging (HR 1.13, 95% CI 1.04–1.22, p = 0.003) were significant predictors of TTR, while CA 19–9 (HR 4.08, 95% CI 1.75–9.56, p = 0.001) and presence of metastatic lymph nodes at histopathology (HR 2.86, 95% CI 1.35–6.07, p = 0.006) were significant predictors of OS. On multivariable analysis, satellite lesions on CT (HR 2.79, 95% CI 1.09–7.15, p = 0.032, observer 2), vascular involvement on MRI (HR 0.10, 95% CI 0.01–0.85, p = 0.035, observer 1), and texture feature MRI variance (HR 0.55, 95% CI 0.31–0.97, p = 0.040) predicted TTR once adjusted for the independent predictors CA 19–9 and tumor size on imaging. Several qualitative and quantitative features demonstrated associations with TTR, OS, and AJCC stage at univariable analysis (range: HR 0.35–19; p < 0.001–0.045); however, none were predictive of OS at multivariable analysis when adjusted for CA 19–9 and metastatic lymph nodes (p > 0.088). Conclusions: There was reasonable accuracy in predicting tumor grade and higher AJCC stage in ICC utilizing certain qualitative and quantitative imaging traits. Serum CA 19–9, tumor size, presence of metastatic lymph nodes, and qualitative imaging traits of satellite lesions and vascular involvement are predictors of patient outcomes, along with a promising predictive ability of certain quantitative texture features.


Background
Mass-forming intrahepatic cholangiocarcinoma (ICC), the most common subtype of ICC (followed by periductal infiltrating and intraductal growth subtypes), is an epithelial malignancy of the intrahepatic bile ducts that is typically associated with poor patient outcomes: fewer than 40% of patients with resectable ICC survive more than 5 years, and those with unresectable disease typically survive less than 12 months [1][2][3][4]. Although the incidence of ICC is highest in Asia, a rise in known risk factors (such as chronic viral hepatitis, cirrhosis, primary sclerosing cholangitis, fibropolycystic liver disease, and recurrent pyogenic cholangitis) has led to a worldwide rise in its incidence and mortality over the past two decades [1,5-8]. In the United States (US), the reported average incidence of ICC has increased from 0.44 to 1.18 cases per 100,000, representing an annual percentage change of 2.3% between 1973 and 2012 [9].
Despite liver resection followed by adjuvant chemoradiation therapy being the most effective treatment, documented postoperative recurrence rates reach as high as 53 to 79%, and most patients die of their disease [10][11][12]. These dismal facts highlight the need for improved noninvasive tumor characterization and enhanced risk stratification in an effort to better predict clinical outcomes and augment perioperative management, including initiating adjuvant chemotherapy. Histopathologic findings of tumor size, tumor grade, intrahepatic metastasis, vascular invasion, and lymph node metastasis have been established as poor independent prognostic factors in ICC [2,13].
Recent studies have investigated the role of cross-sectional imaging for the characterization of ICC pathology and outcomes. The degree of enhancement on delayed phase computed tomography (CT) was shown to correlate with the amount of fibrous stroma and the frequency of perineural invasion, both of which are poor independent prognostic indicators [14]. Conversely, arterial enhancement of ICC on CT has been shown to be an independent predictor of improved survival [15]. In a study with emphasis on diffusion-weighted imaging (DWI) measured with magnetic resonance imaging (MRI), the authors demonstrated that ICCs with > 1/3 diffusion restriction had more favorable histopathologic features and better clinical outcomes compared to those with < 1/3 diffusion restriction, as less diffusion restriction is believed to correlate with more fibrous stroma [16]. With regard to apparent diffusion coefficient (ADC) quantification, it has been suggested that the ADCmean of ICC is significantly lower than that of the adjacent liver parenchyma, and that poorly differentiated tumors demonstrate a significantly lower ADCmean than well or moderately differentiated tumors [17,18]. Lastly, it is recognized that F-18 FDG PET/CT is an important diagnostic tool in staging of ICC. Recent studies have described SUV (standardized uptake value) quantification as a significant discriminant parameter for predicting poorer outcomes [19,20].
Radiomics is a process by which one can extract quantitative data containing valuable information about pathophysiology from digital medical images [21]. There is very limited data assessing the role of radiomics in ICC. To the best of our knowledge, only one published study has shown an association between texture features based on CT and expression of tumor markers of hypoxia in ICC [22].
While all these reports acknowledging the imaging characterization of ICC are promising, work assessing the relationship between imaging parameters and clinical outcomes is lacking. The main objectives of our study were to assess the diagnostic performance of imaging features, including quantitative radiomics texture features, in determining histopathologic tumor grade, AJCC stage, and in predicting outcomes [time to recurrence (TTR) and overall survival (OS)] in comparison with multiple clinicopathologic and demographic variables in patients with ICC.

Patients
This retrospective single-center study was approved by the local institutional review board at the Icahn School of Medicine at Mount Sinai (ISMMS), New York, NY, with exemption for patient consent. The ISMMS Department of Surgery electronic database was queried between August 2003 and January 2017 using the search term "cholangiocarcinoma" and "CT" and/or "MRI". Inclusion criteria were: 1) patients with pathologically proven mass-forming ICC, 2) patients who underwent preoperative multiphasic CT, MRI, or both within 6 months prior to resection (segmentectomy or partial hepatectomy), 3) lesion size ≥1 cm, and 4) no interval therapy between imaging and surgery. Exclusion criteria were: 1) lesion size < 1 cm, 2) patients who had undergone prior locoregional or systemic treatment for their malignancy, 3) mixed ICC/hepatocellular carcinoma (HCC) histology, and 4) patients with technically inadequate imaging studies. Of the initially included patients (n = 98), twenty-five patients were excluded as follows: lesion size < 1 cm (n = 4), prior treatment (n = 8), mixed ICC/HCC tumor pathology (n = 8), and technically inadequate imaging (n = 5). The final study population comprised 73 patients (26 M/47F; mean age 63 ± 11.4 years; range 24-81 years). The study flow chart is shown in Fig. 1.
The following clinical data were recorded for each patient at the time of preoperative imaging after interrogating the medical records: age, gender, race/ethnicity, serum CA 19-9, presence and etiology of underlying chronic liver disease, presence/absence of tumor recurrence, tumor recurrence date, and date of death or date of last follow-up.

Image acquisition
Multiphasic CT and/or MRI were performed using a variety of clinically available imaging platforms, as outside institutional imaging studies comprised some of our preoperative study population. On CT, these included GE Medical Systems, Siemens and Philips scanners. Several multichannel MRI systems were used for scanning, including 1.5 T (Avanto, Aera, Sonata, and Symphony, Siemens Healthineers; and Signa HD, HDxt, Optima 450w, GE Medical Systems) (n = 32) or 3 T (Skyra, Siemens Healthineers, 750, GE Medical Systems) (n = 4) imaging platforms.
The sequences and acquisition parameters varied slightly between different imaging platforms; however, arterial phase (AP) images were obtained 20-40 s after iodinated (CT) or gadolinium-based (MRI) intravenous contrast administration and portal venous phase (PVP) images were obtained 60-100 s after contrast administration. Twelve MRI exams were performed with a liver-specific gadolinium-based contrast agent (gadoxetic acid, Eovist/Primovist, Bayer Healthcare); in these cases, equilibrium (EP)/transitional phase (TP) images were obtained 3-6 min after contrast administration, and hepatobiliary phase (HBP) images were obtained 10 to 20 min after contrast administration. Extracellular contrast agents used in the remaining cases included gadobutrol (Gadavist, Bayer Healthcare) and gadopentetate dimeglumine (Magnevist, Bracco Diagnostics). Diffusion weighted imaging (DWI) was available in 19 patients, with b-values ranging from 50 to 1000 s/mm². ADC maps were generated automatically by the scanner.

Qualitative image analysis
For qualitative analysis, two fellowship-trained, board-certified abdominal radiologists (observer 1, SL; and observer 2, KL, with 8 and 13 years of experience in abdominal imaging at the time of the study, respectively) independently reviewed the CT and MR images using PACS (Centricity 3.0, General Electric Medical Systems). The reviewers were aware that the patients had ICC but were unaware of any other clinicopathologic information. The index lesion, identified as the largest lesion on a single axial image and selected by both observers in consensus, was used for qualitative and quantitative analysis and for correlation with pathology findings and outcomes. In patients with multifocal ICC (n = 4), the single largest lesion was analyzed; multifocal ICC was defined as the presence of at least one additional tumor nodule greater than 2 cm away from the index lesion.
The observers recorded the segmental location of the index lesion on PVP, as well as the presence/absence of ancillary findings including liver capsule bulging or retraction (unequivocal outward or inward liver contour change immediately superficial to an ICC lesion, respectively), vascular involvement (the presence of obvious enhancing tumor thrombus within the portal and/or hepatic veins, vascular encasement or distortion), peripheral biliary ductal dilatation, satellite lesions (small tumor nodules within 2 cm of the index lesion), and presence of additional non-satellite lesions (Fig. 2). Presence or absence of liver cirrhosis based on established morphologic criteria was recorded [23].
Dynamic enhancement patterns on CT and MRI were classified into 2 categories to allow for adequate statistical analysis: peripheral progressive whole-lesion (progressive whole-lesion enhancement starting from its periphery over time) + persistent rim enhancement, or other (includes wash in/wash out, solid whole lesion progressive (progressive whole-lesion enhancement over time), hypovascular, and necrotic). ICCs were categorized on T2-weighted imaging as hyperintense to adjacent liver parenchyma, targetoid (T2 hyperintense peripheral cellular region with a more T2 hypointense central core), or other (includes isointense and heterogeneous). ICCs were categorized on DWI sequences (when available) as hyperintense to adjacent liver parenchyma, targetoid (DWI hyperintense peripheral cellular region with a more DWI hypointense central core), or other (includes isointense and heterogeneous). Lesions were evaluated on ADC maps as hypointense or other (includes isointense, hyperintense, inverse targetoid, and heterogeneous). For cases performed with gadoxetic acid, lesions were assessed on the T1-weighted HBP as: hypointense compared to surrounding liver parenchyma or other (includes isointense, hyperintense or targetoid).

Quantitative image analysis
The observers recorded the maximum lesion size of the index lesion on PVP. Additional quantitative image analysis was performed utilizing regions of interest (ROIs) drawn on index lesions on all applicable phases of enhancement and the pre-contrast phase, as well as on non-tumoral liver parenchyma on the PVP by Observer 3 (MK, a fourth-year radiology resident with 1 year of experience in abdominal MRI at the time of the study). A single ROI was drawn to include as much of the index lesion as possible on the axial slice designated by the observers on each phase. In all cases, the ROI diameter was > 1 cm.
ROIs were drawn on the Osirix DICOM viewer (v5.5.2, Pixmeo, Bernex, Switzerland). ROI data were subsequently analyzed with custom written scripts using MATLAB (vR2016b, Mathworks Inc., Natick, MA). Lesion enhancement ratios (ERs) were calculated for arterial and portal venous phases for CT exams, and ERs were calculated for MRI exams for AP, PVP and equilibrium/transitional phases, as well as HBP (when available) as follows:
Lesion ADCmean and ADCmin values were calculated in 19 patients using monoexponential fitting of the signal intensity (SI) decay curve with the following formula using two b-values: ADC = ln(S2/S1)/(b1 - b2), where S1 and S2 are the SI at b-values b1 = 50 s/mm² and b2 = 400-500 s/mm², respectively; these b-values were selected because they were the most common combinations among the different DWI protocols.
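The two-point ADC formula above can be illustrated with a minimal Python sketch. The study's MATLAB scripts are not reproduced in this article, so the function names below are hypothetical, and computing ADCmean and ADCmin from per-voxel ADC values is an assumption rather than the documented procedure.

```python
import math

def adc_two_point(s1, s2, b1=50.0, b2=500.0):
    """Two-point monoexponential ADC (mm^2/s when b-values are in s/mm^2).

    s1 and s2 are signal intensities at b-values b1 and b2.
    Implements ADC = ln(S2/S1) / (b1 - b2).
    """
    if s1 <= 0 or s2 <= 0:
        raise ValueError("signal intensities must be positive")
    if b1 == b2:
        raise ValueError("b-values must differ")
    return math.log(s2 / s1) / (b1 - b2)

def roi_adc_stats(s1_values, s2_values, b1=50.0, b2=500.0):
    """Return (ADCmean, ADCmin) over paired voxel signals in an ROI.

    Assumption: ADC is computed voxel by voxel and then summarized.
    """
    adcs = [adc_two_point(a, b, b1, b2) for a, b in zip(s1_values, s2_values)]
    if not adcs:
        raise ValueError("ROI is empty")
    return sum(adcs) / len(adcs), min(adcs)
```

Because S2 is lower than S1 in tissue with measurable diffusion and b1 - b2 is negative, the ratio is positive, which matches the sign convention of the formula in the text.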
Multiple second order Haralick texture features (energy, contrast, correlation, variance, homogeneity, sum average, sum variance, sum entropy, entropy, difference variance, difference entropy, information measure of correlation 1, information measure of correlation 2, and maximal correlation coefficient) were extracted from signal values in the ROIs on PVP images, also utilizing MATLAB software, for both CT and MRI by observer 4 (SH, an MR physicist with 6 years of experience at the time of the study) in consensus with observer 3 [21,[24][25][26][27][28]. Before texture analysis, SI values in the ROIs were normalized to a range within three standard deviations of the mean SI of the ROI and decimated to 64 discrete bin values. The PVP was selected for texture analysis to allow for adequate lesion conspicuity and for consistency, as PVP images were performed in all MRI and CT cases. Because data from different imaging vendors, platforms, and protocols were included, data preprocessing using normalization was performed to reduce the signal variation between acquisitions [29].
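The preprocessing and texture extraction described above can be sketched in Python as follows. This is an illustrative sketch, not the study's MATLAB code: the function names are hypothetical, the ROI is assumed to be a rectangular patch, a single horizontal pixel offset is assumed for the gray-level co-occurrence matrix (GLCM), and only five of the listed Haralick features are computed.

```python
import math

def quantize_roi(roi, n_bins=64, n_sd=3.0):
    """Clip a 2D ROI to mean +/- n_sd*SD and map it to integer bins 0..n_bins-1."""
    flat = [v for row in roi for v in row]
    mean = sum(flat) / len(flat)
    sd = math.sqrt(sum((v - mean) ** 2 for v in flat) / len(flat))
    if sd == 0:
        return [[0 for _ in row] for row in roi]  # uniform ROI: one gray level
    lo, width = mean - n_sd * sd, 2 * n_sd * sd

    def to_bin(v):
        clipped = min(max(v, lo), lo + width)
        return min(int((clipped - lo) / width * n_bins), n_bins - 1)

    return [[to_bin(v) for v in row] for row in roi]

def glcm(image, n_levels, offset=(0, 1)):
    """Symmetric, normalized co-occurrence matrix for one pixel offset."""
    dr, dc = offset
    rows, cols = len(image), len(image[0])
    counts = [[0.0] * n_levels for _ in range(n_levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count both directions (symmetric GLCM)
                total += 2
    if total == 0:
        raise ValueError("ROI too small for the chosen offset")
    return [[x / total for x in row] for row in counts]

def texture_features(p):
    """A subset of Haralick features from a normalized GLCM p."""
    cells = [(i, j, p[i][j]) for i in range(len(p)) for j in range(len(p))]
    mean = sum(i * v for i, _, v in cells)
    return {
        "energy": sum(v * v for _, _, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "variance": sum((i - mean) ** 2 * v for i, _, v in cells),
    }
```

A typical call would be `texture_features(glcm(quantize_roi(roi), 64))`. Clipping at three standard deviations before binning limits the influence of outlier voxels on the 64-level discretization, which is one plausible reading of the normalization step described in the text.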

Study endpoints
Histopathologic analysis
Pathologic tumor grade (defined as G1: well differentiated; G2: moderately differentiated; G3: poorly differentiated) and AJCC tumor stage (8th edition) were extracted from pathology reports from the electronic medical record [30][31][32]. When a single tumor contained regions of different degrees of differentiation, the lesion was classified based on the worse degree of tumor differentiation. Presence/absence of vascular invasion and presence/absence of nodal metastasis within regional lymph nodes submitted with all surgical specimens were also recorded.

Patient outcomes
Our study endpoint of time to recurrence (TTR) was defined as the time between surgical resection and the development of locoregional or distant tumor recurrence. Overall survival (OS) was calculated as the time between surgical resection and the date of death (from any cause) or the date of last clinical or imaging follow-up. A final review of the patients' medical records was undertaken in September 2018.

Statistical analysis
Data acquired from each imaging modality (CT or MRI) was analyzed separately. For the purpose of statistical analysis, patients with multifocal ICC (n = 4) were included in the group designated as positive for satellite lesions.
Logistic regression was used to assess the utility of demographic, clinical, and imaging factors, alone and in combination, as predictors of tumor grade and AJCC stage, and was quantified in terms of area under the ROC curve. In order to perform logistic regression analysis, AJCC stage was analyzed as a binary variable (stage I-II vs. III).
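As an illustration of how the area under the ROC curve behaves for a single binary imaging trait, the sketch below computes the AUC directly from its rank-based definition. The analysis in this study was performed in SAS; the function name and the four-patient example data here are hypothetical. For one binary predictor that is positively associated with the outcome, the fitted logistic probabilities take only two values, so the AUC reduces to the mean of sensitivity and specificity.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case, counting ties as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical data: 1 = stage III, 0 = stage I-II; trait present = 1.
stage_iii = [1, 1, 0, 0]
biliary_obstruction = [1, 1, 0, 1]
example_auc = auc(stage_iii, biliary_obstruction)
```

In this toy example the trait has a sensitivity of 1.0 and a specificity of 0.5, so the AUC is their mean.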
The association of each clinical, demographic, qualitative and quantitative imaging factor with OS and TTR was assessed using log-rank and generalized Wilcoxon tests. Survival curves and the median and inter-quartile range of OS and TTR were derived using the Kaplan-Meier product-limit estimator. Cox proportional hazards regression was used to estimate the hazard ratio (HR) of individual factors as predictors of each survival outcome and to assess the effects of feature combinations for the prediction of each outcome. Only variables observed to be significant predictors of at least one outcome according to at least one of the univariable log-rank and Wilcoxon tests were entered in the multivariable analyses. The Fisher exact test was used to assess the association of each qualitative imaging trait from each observer with each binary outcome.
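The Kaplan-Meier product-limit estimator mentioned above can be sketched in a few lines of Python. This is a minimal illustration with hypothetical function names and made-up follow-up times, not the SAS procedure used in the study; it omits confidence intervals and the log-rank comparison.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time for each patient.
    events: 1 if the event (recurrence or death) was observed, 0 if censored.
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            surv *= 1 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_removed  # events and censored cases leave the risk set
    return curve

def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up

# Hypothetical follow-up in months; the second patient is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored patients contribute to the number at risk up to their last follow-up but never reduce the survival estimate, which is why the curve steps down only at observed event times.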
Stepwise variable selection in the context of logistic and Cox proportional hazards regression was then used to identify subsets of variables representing significant independent predictors of each binary and survival outcome, respectively. Inter-observer agreement in terms of the qualitative imaging traits was assessed using the simple kappa (K) coefficient. All statistical tests were conducted at the two-sided 5% significance level using SAS 9.4 (SAS Institute, Cary, NC).
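For readers unfamiliar with the simple kappa coefficient, the following Python sketch shows the standard unweighted Cohen's kappa for two observers rating the same lesions. The function name and the example ratings are hypothetical and are not taken from the study data.

```python
def cohen_kappa(ratings_1, ratings_2):
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    if len(ratings_1) != len(ratings_2) or not ratings_1:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_1)
    categories = set(ratings_1) | set(ratings_2)
    # Observed agreement: fraction of cases rated identically.
    p_observed = sum(a == b for a, b in zip(ratings_1, ratings_2)) / n
    # Chance agreement from each rater's marginal category frequencies.
    p_chance = sum(
        (ratings_1.count(c) / n) * (ratings_2.count(c) / n) for c in categories
    )
    if p_chance == 1:
        return 1.0  # both raters used a single identical category
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical presence (1) / absence (0) of satellite lesions in four exams.
kappa_example = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

Kappa corrects the raw agreement rate for the agreement expected by chance, so two observers who agree on three of four exams can still show only moderate agreement when one category dominates.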

Results
Demographic, clinical, histopathologic and outcomes findings

Qualitative image analysis
The 73 index lesions were identified in consensus for qualitative analysis, and the qualitative imaging traits assessed are summarized in Table 3. The observers independently assessed each lesion characteristic for each modality, resulting in 52 CT and 36 MRI exams being evaluated in total. They demonstrated moderate to perfect agreement for most qualitative imaging traits. Only biliary obstruction on MRI was associated with poor tumor differentiation (p = 0.032; observer 1). For prediction of AJCC stage, biliary obstruction on CT was associated with higher stage disease (stage I-II vs. stage III) for both observers (p = 0.006 and p = 0.002, respectively), as was vascular involvement on CT (p = 0.043 and p = 0.009). The presence of satellite lesions on CT (p = 0.022; observer 1) and vascular involvement on MRI (p = 0.005; observer 2) were each significant for one observer. Based on univariable analysis, these features were then entered into a logistic regression model, yielding fair accuracy for prediction of worse tumor grade (maximal AUC of 0.68 for biliary obstruction on MRI, p = 0.032, observer 1) and higher AJCC stage (maximal AUC of 0.73 for biliary obstruction on CT, p = 0.002, observer 2; and AUC of 0.73 for vascular involvement on MRI, p = 0.01, observer 2). The results from logistic regression analysis of qualitative imaging features as predictors of pathologic grade and tumor stage are listed in Table 4.

Quantitative image analysis
There was no quantitative measurement that was pre[...]
In the univariable analysis, ADCmean was a significant predictor of TTR on the log-rank test (p = 0.042), but was not significant on subsequent Cox proportional hazards regression (p = 0.571). ADCmin was a significant predictor of OS on the log-rank (p = 0.005) and generalized Wilcoxon tests (p = 0.015), but was not statistically significant on subsequent Cox proportional hazards regression (p = 1.0), likely related to sample size (n = 19). For enhancement ratios (ERs), only arterial phase (AP) ER on MRI was associated with TTR (HR [...]) [...] (Fig. 3). The only texture feature associated with OS was MRI information measure of correlation 1 (HR 1.87 [0.97-3.62], p = 0.038) (Fig. 4). Results summarizing associations of quantitative imaging features with outcomes on univariable analysis are shown in Table 6.
Table 2 Association of demographic, laboratory and pathologic features with time to recurrence (TTR) and overall survival (OS) and the hazard ratios (HR) from Cox regression to characterize the effect of each feature on outcomes in 73 patients with ICC
Multivariable analysis demonstrated that, after adjusting for CA 19-9 and tumor size, satellite lesions on CT for observer 2 (HR 2.79 [1.09-7.15], p = 0.032), vascular involvement on MRI for observer 1 (HR 0.10 [0.01-0.85], p = 0.035), and MRI variance (HR 0.55 [0.31-0.97], p = 0.040) were predictive of TTR. No qualitative or quantitative feature was predictive of OS when adjusted for CA 19-9 and lymph nodes (all p-values > 0.088). No set of two or more imaging measures was a significant independent predictor of tumor grade, AJCC stage, TTR, or OS after adjusting for the competing risk factors identified for that outcome.

Discussion
In this study, we tested the association of qualitative and quantitative imaging data obtained from pre-operative CT and/or MRI, as well as CA 19-9, with pathology and outcomes in 73 patients with ICC. Our median TTR of 53.9 months (IQR 73.2 months, range 1.6-99 months) and median OS of 79.7 months (IQR 75.4 months, range 1.8-137.3 months) are longer than most published reports (median TTR and median OS have previously been reported as ranging from 7 to 34 months and from 21.8 to 49 months, respectively) [1,10,33,34]. Nevertheless, our findings of elevated serum CA 19-9, histopathologic vascular invasion, metastatic lymph nodes, AJCC tumor stage, and tumor size (measured at imaging) as significant predictors of TTR and OS agree with the literature [1,2,13].
We demonstrated fair accuracy for prediction of higher AJCC stage (maximal AUC of 0.73 for biliary obstruction on CT, p = 0.002, observer 2; and AUC of 0.73 for vascular involvement on MRI, p = 0.01, observer 2) and tumor grade (maximal AUC of 0.68 for biliary obstruction on MRI, p = 0.032, observer 1) utilizing qualitative and quantitative image analysis. After adjusting for competing risk factors using multivariable analysis, we found that the presence of satellite lesions on CT (HR 2.79 [1.09-7.15], p = 0.032, observer 2), vascular involvement on MRI (HR 0.10 [0.01-0.85], p = 0.035, observer 1), and the texture feature MRI variance (HR 0.55 [0.31-0.97], p = 0.040) remained predictors of TTR. Several quantitative imaging features, including some Haralick texture features in addition to other qualitative imaging traits, were significant predictors of TTR and OS in univariable analysis, but were not confirmed at multivariable analysis. We believe these results are promising, especially as data regarding texture analysis for noninvasive characterization of ICC and clinical outcomes using cross-sectional imaging are limited [22]. Our findings corroborate and expand upon previous studies that have investigated the imaging features of ICC and the correlations between radiologic and pathologic findings [2,13,16]. The imaging features of ICC correlate with specific histopathologic features: intrahepatic biliary dilatation reflects the tumor's origin from the biliary duct; peripheral enhancement represents viable tumor cells, with delayed enhancement of a central fibrous/scirrhous stroma composed of desmoplastic tissue occurring later in time; and the presence of satellite nodules indicates the tumor's proclivity to invade small portal vessels and along portal triads [8].
Previous work has identified a prognostic implication of the delayed phase enhancement on CT, with greater degree of enhancement representing a greater quantity of fibrous stroma and perineural invasion, correlating with poorer outcomes [14]. The presence of satellite nodules, macrovascular invasion, and portal venous or delayed phase enhancement have been previously described as poor prognostic indicators [2,8,11,13,14,35,36]. Of note, some qualitative features were statistically significant for only one of the two observers, likely due to our limited sample size; features that were not significant for both observers may therefore not be helpful in predicting outcomes. ADCmean has been shown to be significantly lower than that of the adjacent liver parenchyma in ICC (with poorly differentiated tumors demonstrating a significantly lower ADCmean) [17,18]. Our findings of ADCmean as a significant predictor of TTR on the log-rank test (p = 0.042), and ADCmin as a significant predictor of OS on the univariable log-rank (p = 0.005) and generalized Wilcoxon (p = 0.015) tests, support the potential value of DWI in the imaging workup of ICC; notably, the lack of significant results using other statistical tests could be explained by the small sample size. Radiomics quantification including histogram quantification and Haralick texture analysis, a mathematical method that generates various quantitative parameters characterizing the spatial variation of gray levels throughout an image, has shown correlations between calculated texture features and histopathologic characteristics, genomic data, and clinical outcomes in various tumor types [24,27,28,[37][38][39]. Texture analysis is sensitive to subtle changes in tumor morphology that may not be detected visually. Intra-tumoral changes due to neovascularity, tumor necrosis, and aggressive growth patterns within ICC contribute to heterogeneity, which may be quantified using texture analysis.
As expected in our study, the texture features found to be significant on MRI were different than those found to be significant on CT without redundancy or overlap, reflecting the innate differences between both modalities. The MRI texture feature variance was the only texture feature to remain a significant predictor of outcomes on multivariable analysis; this may suggest that MRI texture features are potentially more valuable than CT texture features in characterizing ICC, possibly due to the greater soft tissue contrast resolution in MRI.
These specific texture features in our study have also shown significant results for other tumor types. For example, the entropy feature, which is a measure of disorder in the distribution of signal intensities in the ROI and is thought to be a manifestation of tumor heterogeneity, has been previously shown to predict tumor recurrence, disease-free survival and OS for hepatocellular carcinoma (HCC) [40,41]. In a recent study of 25 patients with biopsy-proven ICC, significant correlations between certain grey level co-occurrence matrix texture features based on CT and immunohistochemical markers of hypoxia were identified [22]. Specifically, the entropy texture feature was significantly associated with EGFR expression (R² = 0.17, p < 0.05). The authors also found that the correlation texture feature was associated with VEGF expression (R² = 0.23, p < 0.05) and EGFR expression (R² = 0.21, p < 0.05); in contrast, we did not find any associations between the correlation texture feature on either CT or MRI and pathologic markers or outcome [22]. While previous studies including ours have described associations between imaging texture features and pathologic features and clinical outcomes, establishment of meaningful biologic correlates for specific texture features remains under active investigation.
Our methods could be clinically applicable and relevant, especially as this type of analysis can be performed using a standard clinical CT or MRI protocol for the initial preoperative assessment of liver tumors in efforts to predict tumor type, tumor grade, tumor stage, and outcomes. Imaging features can be useful to predict TTR so that more aggressive neoadjuvant and/or locoregional therapies (including chemotherapy) and postoperative surveillance can be instituted. Furthermore, integration of clinical variables, especially serum CA 19-9, in conjunction with qualitative and quantitative imaging data may potentially yield the best predictive accuracy for the non-invasive assessment of ICCs. Regarding texture analysis, sequences, protocols, and radiomics analysis need to be standardized to enable widespread application of this technique.
We recognize several limitations to our study. Several features in our qualitative analysis were combined in order to provide enough statistical power; expansion of our sample size in subsequent work would allow for a more robust statistical analysis. There was variability in the CT and MRI acquisition techniques as these exams were performed on a variety of clinical scanners over the duration of the long study period. Allowing preoperative imaging to be obtained up to 6 months prior to surgery may have introduced bias and affected our results, as the tumor could have developed more aggressive features by the time surgery was performed. Only one observer was included for the quantitative imaging analysis; having two independent observers would have allowed for assessment of inter-observer reproducibility of quantitative assessments. Assessment of DWI/ADC was limited as there were only 19 cases where DWI/ADC was available, and variation in image acquisition technique can influence ADC measurement. We sought to minimize this bias by analyzing DWI acquired with the same b-values (b50 and b400-500 s/mm²); however, more focused work on DWI/ADC with less potential for bias is needed to assess its true ability in predicting outcomes. Although texture-based discrimination differs between 1.5 T and 3 T MRI because of varying SNR and artifacts at different field strengths, the relatively low number of 3 T scans (n = 4) compared with 1.5 T scans (n = 32), together with our efforts to normalize the texture data, mitigates this limitation. It is possible that our limited sample size and variability due to scanners and protocols used in this retrospective study may explain why combinations of blood biomarkers and imaging features did not yield significant results at multivariable analysis. Finally, our study lacks a validation cohort, which may be assessed in future studies.

Conclusion
In conclusion, our study demonstrated reasonable accuracy for the prediction of tumor grade and higher AJCC stage in ICC utilizing certain qualitative and quantitative imaging traits. Serum CA 19-9, imaging tumor size, presence of metastatic lymph nodes, and qualitative imaging traits of satellite lesions and vascular involvement are predictors of patient outcomes, along with a promising predictive ability of certain quantitative texture