
A transformer-based multi-task deep learning model for simultaneous infiltrated brain area identification and segmentation of gliomas

Abstract

Background

The anatomical brain areas infiltrated by a glioma and the tumor boundaries have a significant impact on clinical decision making and available treatment options. Identifying glioma-infiltrated brain areas and delineating the tumor manually is a laborious and time-intensive process. Previous deep learning-based studies have mainly focused on automatic tumor segmentation or on predicting genetic/histological features. However, few studies have specifically addressed the identification of infiltrated brain areas. To bridge this gap, we aim to develop a model that can simultaneously identify infiltrated brain areas and perform accurate segmentation of gliomas.

Methods

We developed a transformer-based multi-task deep learning model that performs two tasks simultaneously: identifying infiltrated brain areas and segmenting gliomas. The multi-task model leverages shared location and boundary information to enhance the performance of both tasks. Our retrospective study involved 354 glioma patients (grades II-IV) with single or multiple infiltrated brain areas, divided into training (N = 270), validation (N = 30), and independent test (N = 54) sets. We evaluated predictive performance using the area under the receiver operating characteristic curve (AUC) and Dice scores.

Results

Our multi-task model achieved impressive results on the independent test set, with an AUC of 94.95% (95% CI, 91.78–97.58), a sensitivity of 87.67%, a specificity of 87.31%, and an accuracy of 87.41%. Specifically, for grade II, III, and IV gliomas, the model achieved AUCs of 95.25% (95% CI, 91.09–98.23; 84.38% sensitivity, 89.04% specificity, 87.62% accuracy), 98.26% (95% CI, 95.22–100; 93.75% sensitivity, 98.15% specificity, 97.14% accuracy), and 93.83% (95% CI, 86.57–99.12; 92.00% sensitivity, 85.71% specificity, 87.37% accuracy), respectively, for the identification of infiltrated brain areas. Moreover, the model achieved a mean Dice score of 87.60% for whole tumor segmentation.

Conclusions

Experimental results show that our multi-task model outperformed state-of-the-art methods. This performance demonstrates the potential of our work as an innovative solution for identifying tumor-infiltrated brain areas and suggests that it can serve as a practical tool for supporting clinical decision making.

Introduction

Gliomas are malignant tumors that develop in the glial cells of the brain and spinal cord, and they are among the most lethal neurological malignancies [1, 2]. The World Health Organization (WHO) classifies gliomas into four grades according to their level of aggressiveness [3, 4]. Although advanced therapies have been developed for glioma treatment, neurological surgery remains the primary treatment modality for improving patient survival. The shape and size of gliomas can vary significantly depending on their location in the brain and their growth rate. Moreover, the definition of glioma boundaries typically depends on the expertise of the neuroradiologist. Manual segmentation and identification of glioma-infiltrated brain areas are extremely tedious and time-consuming. Therefore, there is a significant clinical need for automatic segmentation and identification of glioma-infiltrated brain areas to aid clinical decision making, treatment planning, and ongoing tumor monitoring.

Magnetic resonance imaging (MRI) is a highly promising imaging technique due to its non-invasive nature. T2-fluid-attenuated inversion recovery (T2-FLAIR) abnormality is a reliable indicator of tumor progression and has been found to correlate with survival [5, 6]. Numerous studies have proposed automatic segmentation of the whole tumor on T2-FLAIR for volumetric measurement, radiomics, or radiogenomics [7, 8]. Furthermore, multi-task convolutional neural networks (CNNs) have been widely proposed for tumor segmentation while simultaneously addressing tasks such as IDH genotyping [9, 10], grading [11], molecular subtyping [11, 12], and detection of enhancing tumor [13]. For example, van der Voort et al. [11] developed a multi-task CNN (referred to as COM-Net for convenience) that predicts the IDH mutation status, the 1p/19q co-deletion status, and the grade of a tumor while simultaneously segmenting it. Cheng et al. [10] proposed a transformer-based multi-task model (MTTU-Net) for glioma segmentation and IDH genotyping. Similar to our proposed method, their hybrid CNN-Transformer encoder extracts shared local and global information for both glioma segmentation and IDH genotyping.

However, most studies have focused on glioma segmentation for clinical applications, with relatively little attention given to the automatic identification of the precise anatomical location of gliomas within the brain. Notably, the anatomical location of a glioma plays a crucial role in determining treatment options, clinical course, and prognosis [14, 15].

A glioma typically manifests in one or multiple brain areas (Fig. 1), most consistently in the frontal lobe (F), parietal lobe (P), occipital lobe (O), temporal lobe (T), and insula (I). Gliomas in the insular area are particularly challenging in neurosurgical oncology [16]. Different tumor locations within these areas can have varying effects on tumor growth, symptoms, and treatment strategies, as previous studies have shown correlations between specific tumor types and locations and clinical outcomes [15, 17]. Accurate identification of infiltrated brain areas holds important value for anatomical lesion localization, for the selection of surgical approaches (such as the transcortical approach for frontal lobe lesions), and for defining postoperative radiation therapy targets. However, identifying the glioma-infiltrated brain areas remains challenging due to the wide variation in tumor appearance. Unlike most multi-class classification tasks, identifying the glioma-infiltrated brain areas is a multi-label classification task, meaning that one subject can have one or more labels. For instance, as shown in Fig. 1, a glioma can be found in a single brain area (Fig. 1(a)) or in more than one brain area (Fig. 1(b) and (c)). Although existing CNN models such as VGG16 [18], ResNet [19], EfficientNet [20], and Inception [21] have achieved significant performance in multi-class classification, accurately identifying the glioma-infiltrated brain areas without knowing the precise boundaries of the gliomas may be challenging for these models.

Fig. 1

Examples of glioma patients with varying degrees of brain infiltration, including (a) infiltration of a single brain area; (b) infiltration of two brain areas, and (c) infiltration of three brain areas
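To make the multi-label setting concrete, the sketch below shows one plausible way to encode infiltrated areas as multi-hot target vectors, with each of the five areas predicted independently. The area ordering follows the order listed above; the code is illustrative, not the authors' implementation.

```python
import torch

# Fixed area ordering (frontal, parietal, occipital, temporal, insula),
# following the order in which the areas are introduced in the text.
AREAS = ["frontal", "parietal", "occipital", "temporal", "insula"]

def encode_areas(infiltrated):
    """Map a set of infiltrated-area names to a multi-hot target vector."""
    return torch.tensor([float(a in infiltrated) for a in AREAS])

# A single-area case has exactly one positive label ...
print(encode_areas({"frontal"}))            # tensor([1., 0., 0., 0., 0.])
# ... while a multi-area case has several, unlike one-hot multi-class targets.
print(encode_areas({"frontal", "insula"}))  # tensor([1., 0., 0., 0., 1.])
```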

To address the aforementioned issues, we propose a transformer-based multi-task deep learning model that achieves simultaneous glioma segmentation and identification of infiltrated brain areas in an end-to-end framework. The main contribution of our study is the ability to accomplish these tasks simultaneously by sharing local and global features extracted from a hybrid CNN and Transformer network. Specifically, our multi-task deep learning framework leverages the boundary information from segmentation to enhance the performance of identifying glioma-infiltrated brain areas.

Materials and methods

Patient population

With the approval of the institutional ethics committee, patients who underwent brain tumor resection and were diagnosed with glioma according to the 2016 WHO criteria [3] were retrospectively enrolled following specific inclusion and exclusion criteria. The patient selection procedure is shown in Fig. 2; a total of 354 patients were included. The inclusion criteria were glioma confirmed by pathological examination and available preoperative MRI data. The exclusion criteria were as follows: (1) incomplete preoperative T2-FLAIR and post-contrast T1-weighted (T1c) MRI, (2) a history of other malignant tumors, and (3) images with severe noise and/or artifacts. The two MRI sequences, T1c and T2-FLAIR, were acquired on scanners from three vendors at two field strengths: Siemens 1.5 T, Siemens 3 T, Philips 1.5 T, Philips 3 T, and GE 3 T. The in-plane pixel spacings of the T2-FLAIR images range from 0.34 mm to 0.78 mm (average, 0.60 mm), with slice thicknesses from 4.0 mm to 6.0 mm (average, 5.05 mm). The in-plane pixel spacings of the T1c images range from 0.45 mm to 0.94 mm (average, 0.65 mm), with slice thicknesses from 0.90 mm to 6.0 mm (average, 4.85 mm). Images obtained by the different scanners were randomly distributed across the training, validation, and test sets.

Fig. 2

Flowchart of patient selection

The demographic and clinical characteristics of the patients in the whole cohort are shown in Table 1.

Table 1 Demographic and clinical characteristics of patient cohort

Data pre-processing

To reduce the effects of variability in acquisition and sequence parameters, image pre-processing was applied before analysis, including MRI bias-field correction, image registration, skull stripping, and gray-level normalization. The N4ITK algorithm [22] was adopted to correct the low-frequency intensity non-uniformity of the magnetic field in the MR images. Rigid registration in FSL [23] and non-rigid registration in ANTs [24] were used to register the T1c and T2-FLAIR images for each patient. Skull stripping was performed using BET in FSL. Gray-level normalization [25] was applied to adjust the gray values of the MR images to compensate for scanner-related intensity variability. Tumor masks were manually delineated around the tumor outline on the 3D T2-FLAIR slices by radiologists with more than 10 years of MRI experience; these masks were used for training the tumor segmentation model. The glioma-infiltrated brain areas were labeled 1-5 for the frontal lobe, parietal lobe, occipital lobe, temporal lobe, and insula, respectively.
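As a concrete illustration, the following sketch applies two of these steps, N4 bias-field correction and intensity normalization, using SimpleITK and NumPy. It assumes z-scoring over brain voxels as the form of gray-level normalization and uses a hypothetical file name; registration (FSL/ANTs) and BET skull stripping are run as separate command-line tools and are omitted here.

```python
import SimpleITK as sitk
import numpy as np

def n4_correct(path):
    """N4 bias-field correction [22] of a single MR volume."""
    img = sitk.Cast(sitk.ReadImage(path), sitk.sitkFloat32)
    mask = sitk.OtsuThreshold(img, 0, 1)   # rough foreground (head) mask
    return sitk.N4BiasFieldCorrectionImageFilter().Execute(img, mask)

def normalize_gray_levels(img):
    """Z-score normalization over non-zero (brain) voxels; an assumed
    form of the gray-level normalization of [25]."""
    arr = sitk.GetArrayFromImage(img)      # (depth, height, width)
    brain = arr > 0
    arr[brain] = (arr[brain] - arr[brain].mean()) / (arr[brain].std() + 1e-8)
    return arr

# Hypothetical file name; registration and skull stripping happen upstream.
flair = normalize_gray_levels(n4_correct("sub001_T2FLAIR_stripped.nii.gz"))
```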

Network details

The architecture of our proposed transformer-based multi-task model for simultaneous infiltrated brain area identification and segmentation of gliomas is presented in Fig. 3. The network includes an encoder for learning task-relevant features and two decoders for extracting task-specific information. In addition, to capture long-range dependencies and global information, a Swin Transformer module with convolutional layers in each head is embedded in the bottleneck of the encoder, deeply fusing local and global features. The input, which concatenates the T1c and T2-FLAIR images, is first fed into an initial convolution with a 3 × 3 × 3 kernel in the encoder. The task-relevant features extracted by the encoder are then processed separately by the two decoders: one performs voxel-level tumor segmentation, and the other identifies glioma-infiltrated brain areas through a classification head.

Fig. 3

Overview of the proposed transformer-based multi-task deep learning model for infiltrated brain area identification and segmentation of gliomas
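Since the exact layer configuration is not reproduced here, the following is a minimal PyTorch sketch of this shared-encoder, two-branch design; the channel widths are placeholders, and a plain convolution stands in for the Swin Transformer bottleneck.

```python
import torch
import torch.nn as nn

class MultiTaskGliomaNet(nn.Module):
    """Sketch of the shared encoder with a segmentation decoder and a
    classification head; widths and depths are illustrative only."""

    def __init__(self, in_ch=2, n_areas=5, w=8):
        super().__init__()
        # Shared encoder: initial 3x3x3 conv, then strided downsampling.
        self.enc = nn.Sequential(
            nn.Conv3d(in_ch, w, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(w, 2 * w, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(2 * w, 4 * w, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Stand-in for the Swin Transformer bottleneck fusing local/global cues.
        self.bottleneck = nn.Conv3d(4 * w, 4 * w, 3, padding=1)
        # Decoder 1: voxel-level whole tumor segmentation (background/tumor).
        self.seg_dec = nn.Sequential(
            nn.ConvTranspose3d(4 * w, 2 * w, 2, stride=2), nn.ReLU(inplace=True),
            nn.ConvTranspose3d(2 * w, w, 2, stride=2), nn.ReLU(inplace=True),
            nn.Conv3d(w, 2, 1),
        )
        # Decoder 2: multi-label head for the five candidate brain areas.
        self.cls_head = nn.Sequential(
            nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(4 * w, n_areas),
        )

    def forward(self, x):  # x concatenates the T1c and T2-FLAIR volumes
        feat = self.bottleneck(self.enc(x))
        return self.seg_dec(feat), self.cls_head(feat)

# Toy-sized input for illustration; the paper uses 416 x 352 x 20 volumes.
seg_logits, area_logits = MultiTaskGliomaNet()(torch.randn(1, 2, 8, 64, 64))
print(seg_logits.shape, area_logits.shape)  # (1, 2, 8, 64, 64), (1, 5)
```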

Data partitioning and network implementation

The performance of the models was evaluated on an independent test set that was not used in model training. The remaining data were randomly divided into training and validation sets for parameter selection. After the optimal hyperparameters were selected using the training and validation sets, the final model was trained on the training data and evaluated on the independent test set.

The models were implemented in PyTorch v1.7 and trained with a batch size of 4 and input image dimensions of 416 (height) × 352 (width) × 20 (depth). After gray-level normalization [25], the intensities were linearly transformed to the range [-1, 1]. Data augmentation, including random rotation in the range of [-30°, 30°] and random scaling in the range of [0.95, 1.05], was applied during training. For optimization, we used the SGD optimizer with an initial learning rate of 0.0001 and trained the model for 200 epochs. The learning rate followed a step-decay schedule, reduced by a factor of 10 at the 100th and 160th epochs. We trained our models using cross-entropy loss for classification and a combination of cross-entropy loss and Dice loss for segmentation, with weights of 0.6 and 0.4, respectively. To ensure a fair comparison, the compared models were trained with the same image size, learning rate, and number of iterations.
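A hedged sketch of this training objective and schedule follows: segmentation uses the stated 0.6/0.4 cross-entropy/Dice mix, classification uses per-label (binary) cross-entropy for the multi-label targets, and the step decay fires at epochs 100 and 160. The equal weighting between the two task losses is an assumption, as the paper does not state an inter-task weight.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def dice_loss(seg_logits, target, eps=1e-6):
    """Soft Dice loss computed on the tumor-class probability map."""
    prob = seg_logits.softmax(dim=1)[:, 1]
    inter = (prob * target).sum()
    return 1 - (2 * inter + eps) / (prob.sum() + target.sum() + eps)

def multi_task_loss(seg_logits, seg_gt, area_logits, area_gt):
    """0.6*CE + 0.4*Dice for segmentation, plus multi-label (binary)
    cross-entropy for the brain-area head; 1:1 task weighting is assumed."""
    seg = 0.6 * F.cross_entropy(seg_logits, seg_gt) \
        + 0.4 * dice_loss(seg_logits, seg_gt.float())
    cls = F.binary_cross_entropy_with_logits(area_logits, area_gt)
    return seg + cls

model = nn.Conv3d(2, 2, 3, padding=1)  # stand-in; use the multi-task network
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
# Step decay: divide the learning rate by 10 at epochs 100 and 160.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[100, 160], gamma=0.1)
```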

In addition, the two demographic characteristics (age and sex) and two clinical variables (glioma grade and Karnofsky performance score, KPS) used in the comparison experiment were preprocessed. Sex was encoded as 0 or 1, while the other three variables were standardized with z-scores. Unlike the image information, which is encoded by convolutions, the binary sex label is first transformed into a one-dimensional feature vector through a word embedding layer. Before the fully connected layer, the three standardized clinical features (age, grade, and KPS), the embedded sex feature, and the image features are concatenated and jointly fed into the fully connected layer for the final prediction.
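The sketch below illustrates this late-fusion step under assumed feature dimensions: the binary sex label passes through an embedding layer, and the three z-scored scalars are concatenated with the pooled image features before the final fully connected layer.

```python
import torch
import torch.nn as nn

class ClinicalFusionHead(nn.Module):
    """Sketch of the late-fusion classification head: an embedding for
    the binary sex label plus three z-scored scalars (age, grade, KPS)
    concatenated with pooled image features. Dimensions are assumptions."""

    def __init__(self, img_dim=64, sex_dim=4, n_areas=5):
        super().__init__()
        self.sex_embed = nn.Embedding(2, sex_dim)  # 0/1 -> learned vector
        self.fc = nn.Linear(img_dim + sex_dim + 3, n_areas)

    def forward(self, img_feat, sex, scalars):
        # img_feat: (B, img_dim); sex: (B,) long; scalars: (B, 3) z-scored
        fused = torch.cat([img_feat, self.sex_embed(sex), scalars], dim=1)
        return self.fc(fused)

head = ClinicalFusionHead()
logits = head(torch.randn(2, 64), torch.tensor([0, 1]), torch.randn(2, 3))
```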

Statistical analysis

All trained models were evaluated on both the validation set and the independent test set. The performance of the multi-label classification model was assessed using the area under the receiver operating characteristic curve (AUC), accuracy (ACC), sensitivity (SEN), and specificity (SPE); micro-averaging was used to compute all multi-label classification metrics. The optimal operating threshold was determined by maximizing the sum of the sensitivity and specificity values. The 95% confidence intervals (CIs) were obtained by bootstrapping to assess variability. In addition, we used the Dice score (DSC), accuracy, sensitivity, specificity, Intersection over Union (IoU), Average Symmetric Surface Distance (ASSD), and 95th percentile Hausdorff Distance (HD95) to evaluate the performance of the segmentation model. The DSC measures the degree of overlap between the ground truth and the predicted segmentation, with 100% indicating complete overlap.
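For reference, a minimal sketch of this micro-averaged evaluation, assuming scikit-learn: AUC is computed over the pooled label-score pairs, the operating threshold maximizes sensitivity + specificity (the Youden index), and the Dice score is computed on binary masks.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

def micro_auc_and_threshold(y_true, y_score):
    """Micro-averaged multi-label AUC, plus the operating threshold that
    maximizes sensitivity + specificity on the pooled label-score pairs."""
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    auc = roc_auc_score(y_true, y_score, average="micro")
    fpr, tpr, thr = roc_curve(y_true.ravel(), y_score.ravel())
    best = np.argmax(tpr - fpr)  # Youden index: sensitivity + specificity - 1
    return auc, thr[best]

def dice_score(pred, gt):
    """Dice similarity coefficient between binary masks (100% = perfect)."""
    pred, gt = np.asarray(pred, bool), np.asarray(gt, bool)
    return 100.0 * 2 * np.logical_and(pred, gt).sum() / (pred.sum() + gt.sum())
```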

Results

Patient characteristics

By applying the inclusion and exclusion criteria, a total of 354 patients (195 males and 159 females) with an average age of 47.61 ± 12.99 years (range, 11–82 years) were included in our study. Of these, 270 were allocated to the training set, 30 to the validation set, and 54 to the independent test set. As shown in Table 1, 243 patients (68.64%) were diagnosed with low-grade gliomas (grade II and III), while 111 patients (31.36%) were diagnosed with high-grade gliomas (grade IV). A total of 239 gliomas (67.51%) were located within a single anatomic area, while 115 gliomas (32.49%) infiltrated two or more areas. Gliomas involving the frontal lobe accounted for 56.78% of all patients, the parietal lobe 21.47%, the occipital lobe 11.30%, the temporal lobe 40.11%, and the insula 7.62% (the percentages sum to more than 100% because of multi-area infiltration).

Model performance of glioma-infiltrated brain area identification

Tables 2 and 3 and Fig. 4 show the performance of identifying the glioma-infiltrated brain areas. Specifically, Table 3 reports the performance of the models for gliomas that infiltrated single and multiple regions. Our proposed method achieves better performance than VGG16, ResNet50, EfficientNet-b0, COM-Net, and MTTU-Net, with an AUC of 94.95% (95% CI, 91.78–97.58) on the independent test set. Figure 4 shows the receiver operating characteristic (ROC) curves for the six deep learning-based classification models, along with a variant of our model that incorporates four additional clinical characteristics. Evaluated on the independent test set, the AUC values were 78.90% (95% CI, 73.07–84.28), 87.10% (95% CI, 81.73–92.04), 85.23% (95% CI, 78.75–90.74), 90.22% (95% CI, 85.12–94.61), 93.51% (95% CI, 90.22–96.37), 94.95% (95% CI, 91.78–97.58), and 95.07% (95% CI, 92.28–97.48) for VGG16, ResNet50, EfficientNet-b0, COM-Net, MTTU-Net, our method, and our method with clinical characteristics, respectively. Tables 2 and 3 also present the results for clinically relevant patient subgroups. Our method outperforms the aforementioned state-of-the-art classification methods in terms of AUC, achieving 95.25% (95% CI, 91.09–98.23) for grade II, 98.26% (95% CI, 95.22–100.00) for grade III, 93.83% (95% CI, 86.57–99.12) for grade IV, 98.90% (95% CI, 97.46–99.94) for single-area infiltration, 91.48% (95% CI, 84.27–97.08) for double-area infiltration, and 100% (95% CI, 100–100) for triple-area infiltration.

Table 2 Classification performance of the various models on both the validation and independent test set
Table 3 Classification performance of the various models in single and multiple regions on both the validation and independent test set
Fig. 4

ROC curves of the six deep learning-based classification models, including VGG16, ResNet50, EfficientNet-b0, COM-Net, MTTU-Net, and the proposed method, for identifying the glioma-infiltrated brain areas. The seventh subplot evaluates our method enhanced with additional clinical characteristics

We also integrated clinical features into our model, specifically two demographic characteristics (age and sex) and two clinical variables (glioma grade and Karnofsky performance score). The results demonstrate an improvement of 2.73% in AUC, 6.21% in accuracy, 4.77% in sensitivity, and 6.80% in specificity compared with our original MR-only model across all grades on the validation set.

Model performance of tumor segmentation

The performance of five tumor segmentation methods, U-Net, nnU-Net, COM-Net, MTTU-Net, and our method, was evaluated on the validation and independent test sets. The results are presented in Table 4 and Fig. 5 and indicate that the proposed method outperforms the other four methods in terms of DSC for all tumor grades on both sets. As shown in Fig. 5, our method achieves the highest overlap between the ground truth (red curve) and the predicted segmentation (green curve). Table 4 provides the numerical results and shows that the proposed method achieved the best DSC for all tumor grades (II, III, and IV). The mean DSC of the proposed method was 87.60% across all grades, higher than the corresponding scores of the single-task methods (U-Net and nnU-Net) and the multi-task methods (COM-Net and MTTU-Net). Specifically, it achieved a DSC of 88.50% for grade II, 85.44% for grade III, and 88.20% for grade IV on the independent test set.

Table 4 Segmentation performance of the compared methods on both the validation and independent test set
Fig. 5

Visual segmentation results obtained from different segmentation methods. The red curve represents the ground truth, while the green curve shows the predicted segmentation

Model performance of glioma-infiltrated brain area identification vs. experts

To compare the accuracy of infiltrated-lobe classification by humans and by the model, we conducted a comparison experiment on a random sample of 50% of the independent test set. Two experts (X.M.L. and M.L., with 10 and 4 years of experience, respectively) annotated the glioma-infiltrated brain lobes based on the T1c and T2-FLAIR MRI data. They evaluated the MR data from the independent test set consecutively and independently. The outputs of our model were binarized for a fair comparison. Table 5 shows the performance of identifying glioma-infiltrated brain areas, with AUCs of 61.58% (95% CI, 52.38–70.33) and 57.21% (95% CI, 48.10–66.49) for the two experts and 85.30% (95% CI, 77.71–91.70) for our model. The experts spent 50 s and 2 min per case, respectively, far longer than the 0.4 s required by the model. As shown in Fig. 6, the experts' results show large individual differences. Both experts had high sensitivity for the frontal lobe but discriminated poorly in the parietal, occipital, and temporal lobes. In contrast, although our model did not achieve optimal performance in every category, it attained a higher level of discrimination in each brain lobe and better overall performance.

Table 5 Model performance of glioma-infiltrated brain area identification vs. experts
Fig. 6

ROC curves of the two experts and the proposed model

Visualization of local and global information of our model

To validate the extraction of specific global and local features, we compared the hybrid CNN and Transformer network with a pure CNN network using guided backpropagation [26] for visual interpretation. The results are shown in Fig. 7. Given an input image, we perform a forward pass to the last convolutional layer of interest, set all activations except one to zero, and then propagate back to the image to obtain a reconstruction. As shown in Fig. 7, the guided backpropagation map (Guided Backprop) reflects the pixels on which the network focuses, with white indicating high attention and black indicating low attention.

Fig. 7

Guided Backprop maps of (b) the hybrid CNN and Transformer network and (c) the pure CNN model, illustrating each model's attention to global and local information
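A minimal sketch of guided backpropagation with PyTorch hooks is given below (assuming PyTorch ≥ 1.8 for register_full_backward_hook): negative gradients are zeroed at every ReLU on the backward pass, so the resulting input-space map keeps only the positive evidence for the selected activation. The tiny model here is a stand-in, not the networks compared in Fig. 7.

```python
import torch
import torch.nn as nn

def add_guided_backprop_hooks(model):
    """Guided backpropagation [26]: clamp gradients at every ReLU so only
    positive evidence propagates back to the input."""
    def clamp_grad(module, grad_in, grad_out):
        return (torch.clamp(grad_in[0], min=0.0),)
    for m in model.modules():
        if isinstance(m, nn.ReLU):
            m.register_full_backward_hook(clamp_grad)

# Usage: hook a trained model, backpropagate one output unit to the input.
model = nn.Sequential(nn.Conv3d(2, 4, 3, padding=1), nn.ReLU(),
                      nn.Flatten(), nn.Linear(4 * 8 * 32 * 32, 5))
add_guided_backprop_hooks(model)
x = torch.randn(1, 2, 8, 32, 32, requires_grad=True)
model(x)[0, 0].backward()    # keep one output activation, zero the rest
saliency = x.grad.abs()      # high values = high attention in the map
```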

The pure CNN network attends to local tumor information, as evidenced by the high intensity in the tumor region (Fig. 7(c)). In contrast, the hybrid model (Fig. 7(b)) exhibits high-attention regions associated with both the tumor and the brain edges, indicating that it effectively learns global features, specifically the relative position of the tumor with respect to the brain edges. The corresponding class activation maps of the hybrid model provide insight into the decision regions used when predicting infiltrated brain areas. By combining global and local features, the hybrid model exploits the advantages of each: the global features encode the relative positions of the tumor and the brain edges, while the local features focus on the tumor region itself. This combination allows the hybrid model to predict infiltrated brain areas more accurately.

Discussion

In this study, we developed a method that predicts the glioma-infiltrated brain areas while simultaneously segmenting the tumor. Our approach combines a CNN and a transformer network to leverage both local and global features, thereby enhancing the performance of classification and segmentation.

Our method can automatically identify the tumor-infiltrated brain areas and provide a whole tumor segmentation, which can aid in determining the appropriate surgical approach and predicting prognosis [14, 15]. When the tumor infiltrates critical brain areas, the prognosis may worsen due to the increased risk of neurological deficits and complications. Our method provides an alternative for marking the specific tumor location. In addition, the predicted whole tumor segmentation can facilitate tumor volumetric measurement [5, 27], radiomics [28], and radiogenomics [29]. With a short running time and reliable information about the areas of glioma infiltration and segmentation, our model can reduce the burden on clinicians and improve the efficiency of making informed treatment decisions. Moreover, the model's ability to perform these tasks based solely on MR images streamlines the workflow and reduces reliance on other procedures, making it readily applicable in the many clinical settings where MRI is routinely performed. The proposed model thus has the potential to improve patient outcomes, optimize treatment strategies, and contribute to the overall management of glioma patients.

The precise characterization of glioma is critical for clinical treatment planning [11]. Previous studies [5, 7, 8, 10, 11, 27] have developed deep learning methods for predicting the genetic or histological features of glioma or for automatically delineating the tumor. Providing accurate information about the location of tumor-infiltrated brain areas is likewise crucial for clinical decision making, as studies have shown that the incidence of gliomas is related to anatomic location [14, 30]. To the best of our knowledge, few studies have addressed the identification of tumor-infiltrated brain areas and the segmentation of the tumor at the same time. We developed a multi-task network that learns the information shared between the two tasks, enabling us to achieve both goals.

In contrast to molecular subtyping or grading of glioma, where each patient has a single label per task, our infiltrated brain area identification task may assign one or more labels per patient. Directly adapting commonly used deep learning-based multi-class classification models, such as ResNet50 and VGG16, to this multi-label task may not be effective due to the irregular shape of gliomas. While adding multiple MRI sequences can help improve the performance of infiltrated brain area identification, the improvement is limited without prior information about the glioma boundary. Several factors may explain why our proposed method identified glioma-infiltrated brain areas more accurately than other methods. First, unlike the aforementioned methods that perform classification and segmentation separately, our method uses a single encoder to extract the task-relevant features; learning shared representations across tasks can help prevent overfitting on each task and improve overall performance on all tasks. Second, our method co-optimizes the loss functions of brain area identification and tumor segmentation, improving the model's generalization ability and potentially mitigating label imbalance through shared label information. Third, transformer blocks were embedded into the CNN-based network to address the intrinsic locality limitation of convolution operations.

Despite the superior performance on the validation and independent test sets, our study has several limitations. The main drawback is its retrospective nature, which makes it vulnerable to selection and recall bias, potentially affecting the reliability and generalizability of the findings. Our future work will include prospective studies, enabling the longitudinal follow-up of patients from onset and facilitating the collection of real-time data for precise and comprehensive analysis. This approach aims to enhance the validity and robustness of our model while increasing its applicability in clinical settings. In addition, advanced MRI sequences, such as diffusion- or perfusion-weighted imaging, were not included in this study; these sequences are not part of the routine clinical protocol for structural scans. Our results in Tables 2 and 3 indicate that superior performance can be achieved using only T2-FLAIR and T1c MRI, which makes data collection easier and the model simpler, although it may still limit performance. Given the increasing use of advanced MRI sequences in clinical practice, we believe that integrating them into future research will further improve the model's accuracy. It should also be noted that our study is limited to brain area identification and tumor segmentation within a single model. We acknowledge the potential of our model for other glioma-related prediction tasks. In future work, we aim to integrate the prediction of 1p/19q co-deletion, IDH mutation, and molecular subtype to provide a more comprehensive analysis. By integrating these additional factors, we can deepen our understanding of gliomas at the molecular level and contribute to personalized treatment strategies and prognostication, enhancing the overall management of patients with gliomas.

In addition, we acknowledge the significance of detailed structural labeling, including deeper brain structures like the internal capsule. Future research will focus on enhancing the capabilities of our model to enable more accurate and detailed labeling of brain structure and function. We will incorporate advanced techniques and expand the scope of our model to encompass a broader range of anatomical and functional regions.

Conclusion

In conclusion, we have developed a transformer-based multi-task deep learning model that performs two critical tasks: identifying tumor-infiltrated areas of the brain and segmenting the tumor from preoperative MR scans. Our model demonstrated superior performance compared with state-of-the-art classification and segmentation models on both the validation and independent test sets. We believe our model holds significant potential as a practical tool to assist clinicians in accurately identifying glioma-infiltrated areas and supporting treatment decisions.

Availability of data and materials

None.

Abbreviations

AUC: Area under the ROC curve

CI: Confidence interval

ACC: Accuracy

SEN: Sensitivity

SPE: Specificity

WHO: World Health Organization

MRI: Magnetic resonance imaging

T2-FLAIR: T2-fluid-attenuated inversion recovery

T1c: Post-contrast T1-weighted MRI

CNN: Convolutional neural network

References

1. Broekman ML, Maas SL, Abels ER, Mempel TR, Krichevsky AM, Breakefield XO. Multidimensional communication in the microenvirons of glioblastoma. Nat Rev Neurol. 2018;14(8):482–95.

2. Wen PY, Weller M, Lee EQ, Alexander BM, Barnholtz-Sloan JS, Barthel FP, Batchelor TT, Bindra RS, Chang SM, Chiocca EA, et al. Glioblastoma in adults: a Society for Neuro-Oncology (SNO) and European Society of Neuro-Oncology (EANO) consensus review on current management and future directions. Neuro Oncol. 2020;22(8):1073–113.

3. Louis DN, Perry A, Reifenberger G, Von Deimling A, Figarella-Branger D, Cavenee WK, Ohgaki H, Wiestler OD, Kleihues P, Ellison DW. The 2016 World Health Organization classification of tumors of the central nervous system: a summary. Acta Neuropathol. 2016;131:803–20.

4. Louis DN, Perry A, Wesseling P, Brat DJ, Cree IA, Figarella-Branger D, Hawkins C, Ng H, Pfister SM, Reifenberger G, et al. The 2021 WHO classification of tumors of the central nervous system: a summary. Neuro Oncol. 2021;23(8):1231–51.

5. Chang K, Beers AL, Bai HX, Brown JM, Ly KI, Li X, Senders JT, Kavouridis VK, Boaro A, Su C, et al. Automatic assessment of glioma burden: a deep learning algorithm for fully automated volumetric and bidimensional measurement. Neuro Oncol. 2019;21(11):1412–22.

6. Grossman R, Shimony N, Shir D, Gonen T, Sitt R, Kimchi TJ, Harosh CB, Ram Z. Dynamics of FLAIR volume changes in glioblastoma and prediction of survival. Ann Surg Oncol. 2017;24:794–800.

7. Liu Z, Tong L, Chen L, Zhou F, Jiang Z, Zhang Q, Wang Y, Shan C, Li L, Zhou H. CANet: context aware network for brain glioma segmentation. IEEE Trans Med Imaging. 2021;40(7):1763–77.

8. Swinburne NC, Yadav V, Kim J, Choi YR, Gutman DC, Yang JT, Moss N, Stone J, Tisnado J, Hatzoglou V, et al. Semisupervised training of a brain MRI tumor detection model using mined annotations. Radiology. 2022;303(1):80–9.

9. Wang Y, Wang Y, Guo C, Zhang S, Yang L. SGPNet: a three-dimensional multitask residual framework for segmentation and IDH genotype prediction of gliomas. Comput Intell Neurosci. 2021;2021:1–9.

10. Cheng J, Liu J, Kuang H, Wang J. A fully automated multimodal MRI-based multi-task learning for glioma segmentation and IDH genotyping. IEEE Trans Med Imaging. 2022;41(6):1520–32. https://doi.org/10.1109/TMI.2022.3142321.

11. van der Voort SR, Incekara F, Wijnenga MM, Kapsas G, Gahrmann R, Schouten JW, Nandoe Tewarie R, Lycklama GJ, De Witt Hamer PC, Eijgelaar RS, et al. Combined molecular subtyping, grading, and segmentation of glioma using multi-task deep learning. Neuro Oncol. 2023;25(2):279–89.

12. Xue Z, Xin B, Wang D, Wang X. Radiomics-enhanced multi-task neural network for non-invasive glioma subtyping and segmentation. In: International Workshop on Radiomics and Radiogenomics in Neuro-oncology. Springer; 2019. p. 81–90.

13. Weninger L, Liu Q, Merhof D. Multi-task learning for brain tumor segmentation. In: Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries. BrainLes 2019 (held in conjunction with MICCAI 2019), Part I. Springer; 2020. p. 327–37.

14. Larjavaara S, Mantyla R, Salminen T, Haapasalo H, Raitanen J, Jaaskelainen J, Auvinen A. Incidence of gliomas by anatomic location. Neuro Oncol. 2007;9(3):319–25.

15. Mackintosh C, Butterfield R, Zhang N, Lorence J, Zlomanczuk P, Bendok BR, Zimmerman RS, Swanson K, Porter A, Mrugala MM. Does location matter? Characterisation of the anatomic locations, molecular profiles, and clinical features of gliomas. Neurol Neurochir Pol. 2020;54(5):456–65.

16. Przybylowski CJ, Hervey-Jumper SL, Sanai N. Surgical strategy for insular glioma. J Neurooncol. 2021;151:491–7.

17. Sidaway P. Low-grade glioma subtypes revealed. Nat Rev Clin Oncol. 2020;17(6):335.

18. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556. 2014.

19. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. p. 770–8.

20. Tan M, Le Q. EfficientNet: rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning. PMLR; 2019. p. 6105–14.

21. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z. Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. p. 2818–26.

22. Tustison NJ, Avants BB, Cook PA, Zheng Y, Egan A, Yushkevich PA, Gee JC. N4ITK: improved N3 bias correction. IEEE Trans Med Imaging. 2010;29(6):1310–20.

23. Jenkinson M, Beckmann CF, Behrens TE, Woolrich MW, Smith SM. FSL. Neuroimage. 2012;62(2):782–90.

24. Avants BB, Tustison N, Song G, et al. Advanced normalization tools (ANTs). Insight J. 2009;2(365):1–35.

25. Zhao X, Wu Y, Song G, Li Z, Zhang Y, Fan Y. A deep learning model integrating FCNNs and CRFs for brain tumor segmentation. Med Image Anal. 2018;43:98–111.

26. Springenberg JT, Dosovitskiy A, Brox T, Riedmiller MA. Striving for simplicity: the all convolutional net. arXiv preprint arXiv:1412.6806. 2014.

27. Visser M, Müller D, van Duijn R, Smits M, Verburg N, Hendriks E, Nabuurs R, Bot J, Eijgelaar R, Witte M, et al. Inter-rater agreement in glioma segmentations on longitudinal MRI. NeuroImage Clin. 2019;22:101727.

28. Li G, Li L, Li Y, Qian Z, Wu F, He Y, Jiang H, Li R, Wang D, Zhai Y, et al. An MRI radiomics approach to predict survival and tumour-infiltrating macrophages in gliomas. Brain. 2022;145(3):1151–61.

29. Choi YS, Bae S, Chang JH, Kang S-G, Kim SH, Kim J, Rim TH, Choi SH, Jain R, Lee S-K. Fully automated hybrid approach to predict the IDH mutation status of gliomas via deep learning and radiomics. Neuro Oncol. 2021;23(2):304–13.

30. Numan T, Breedt LC, Maciel BDA, Kulik SD, Derks J, Schoonheim MM, Klein M, de Witt Hamer PC, Miller JJ, Gerstner ER, et al. Regional healthy brain activity, glioma occurrence and symptomatology. Brain. 2022;145(10):3654–65.


Acknowledgements

The authors thank Yuankui Wu, PhD (Department of Medical Imaging Center, Nanfang Hospital, Southern Medical University) for re-checking the manual delineated labels required for model training.

Funding

This work was partially supported by the National Natural Science Foundation of China (No.62101239), and the Guangdong Basic and Applied Basic Research Foundation under grants (No. 2023A1515012242 and No. 2023A1515011291).

Author information

Authors and Affiliations

Authors

Contributions

Y. Li: Methodology, Software, Writing—original draft; K. Zheng: Methodology, Software, Writing—original draft; S. Li: Methodology, Software, Writing—original draft; Y. Yi: Writing—review & editing; Min. Li: Data curation; Y. Ren: Data curation; C. Guo: Data curation; L. Zhong: Writing—review & editing; W. Yang: Writing—review & editing; X. Li: Data curation, writing—review & editing; Y. Lin: Conceptualization, Validation, Writing -review & editing, Supervision.

Corresponding authors

Correspondence to Xinming Li or Lin Yao.

Ethics declarations

Ethics approval and consent to participate

Institutional Review Board approval was obtained.

Consent for publication

All authors named in this manuscript gave their consent for this publication and take full responsibility for its content. All authors read and approved the final manuscript.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Li, Y., Zheng, K., Li, S. et al. A transformer-based multi-task deep learning model for simultaneous infiltrated brain area identification and segmentation of gliomas. Cancer Imaging 23, 105 (2023). https://doi.org/10.1186/s40644-023-00615-1
