2033 - Comparative Analysis of Uncertainty Quantification Models in Deep Learning for Dose Prediction of Radiotherapy
Presenter(s)
L. Chen, Z. Wang, T. Zhang, W. Wang, X. Sun, J. Duan, Y. Gao, Z. An, and L. N. Zhao; Department of Radiation Oncology, Xijing Hospital, Fourth Military Medical University, Xi'an, Shaanxi, China
Purpose/Objective(s): Deep learning has recently made substantial progress in accelerating radiotherapy treatment planning, particularly in dose prediction. Because these predictions feed into the clinical workflow, knowing how much confidence to place in them has become increasingly important. However, conventional deep learning methods usually provide no uncertainty estimates, so their predictions may be over-confident or under-confident. To address this limitation, different types and sources of uncertainty in dose prediction have been identified, and various approaches have been proposed to quantify them.
Materials/Methods: We categorize the most significant sources of uncertainty into reducible model uncertainty and irreducible data uncertainty. To model and quantify these uncertainties, we compare four approaches: direct uncertainty prediction with neural networks (DUP), Bayesian neural networks (BNNs), ensembles of neural networks (ENNs), and test-time data augmentation (TTDA). DUP employs two neural networks, one for dose prediction and another that predicts the uncertainty of the first network's predictions. BNNs integrate Bayesian learning principles into deep neural networks and use Monte Carlo dropout for approximate posterior inference. ENNs combine the predictions of multiple neural networks during inference. TTDA uses a single neural network and generates several predictions by augmenting the input data at test time; the spread of these predictions indicates how certain the prediction is. All models are trained and evaluated on the public head and neck cancer dataset from the OpenKBP 2020 AAPM Challenge.
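As an illustration of how the sampling-based approaches (ENNs and TTDA) can turn several predictions into an uncertainty map, the following is a minimal NumPy sketch and not the implementation used in this study. The toy "models" are hypothetical stand-ins for trained networks, the choice of flips as the augmentation is an assumption (the abstract does not name the augmentations used), and taking the voxel-wise standard deviation as the uncertainty measure is likewise an assumption.

```python
import numpy as np

def summarize_samples(samples):
    """Collapse stacked dose predictions (S, ...) into a mean dose and a voxel-wise std."""
    samples = np.asarray(samples, dtype=float)
    return samples.mean(axis=0), samples.std(axis=0)

def ensemble_predict(models, x):
    """ENN-style: run every ensemble member on the same input and aggregate."""
    return summarize_samples([m(x) for m in models])

def ttda_predict(model, x, flip_axes=(None, 0, 1)):
    """TTDA-style: one model, several flipped copies of the input.

    Each prediction is flipped back before aggregation so that all
    predictions are aligned voxel by voxel. Flips are an assumed
    augmentation, chosen only because they are easy to invert.
    """
    preds = []
    for ax in flip_axes:
        if ax is None:
            preds.append(model(x))
        else:
            preds.append(np.flip(model(np.flip(x, axis=ax)), axis=ax))
    return summarize_samples(preds)

# Toy input standing in for a 2D slice of planning data (hypothetical).
rng = np.random.default_rng(0)
x = rng.random((4, 4))

# Hypothetical ensemble members: each scales the input slightly differently.
models = [lambda v, s=s: s * v for s in (0.9, 1.0, 1.1)]
ens_mean, ens_std = ensemble_predict(models, x)

# Hypothetical single model that is not flip-equivariant, so TTDA sees a spread.
ramp = np.arange(4.0).reshape(1, 4)  # broadcasts across rows
tt_mean, tt_std = ttda_predict(lambda v: v + ramp, x)
```

Under these assumptions, ENNs need one trained network per ensemble member, whereas TTDA reuses a single network and pays only for the extra forward passes.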
Results: All models are capable of producing uncertainty estimates for the predicted dose. Among them, ENNs show statistically significant reductions in loss value and in error across most metrics. Specifically, ENNs outperform the other models on clinically relevant DVH dosimetric metrics, with mean absolute error (MAE) values of 2.34 for D99, 1.64 for D95, 1.96 for D1, 1.95 for Dmean, and 1.99 for D0.1cc. Although ENNs achieve the best performance, their high computational cost in both time and memory, during training and inference, should be weighed when deciding whether to use them. In this respect, DUP and TTDA may be more favorable options because they are computationally more efficient than ENNs.
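For readers unfamiliar with the DVH metrics reported above, the sketch below shows one common way to compute them and their MAE from dose arrays. It is illustrative only: the percentile conventions (for example, D99 taken as the 1st percentile of the dose inside the structure), the `voxel_volume_cc` parameter, and the toy data are assumptions, and the abstract does not state the dose units or the exact metric definitions used.

```python
import numpy as np

def dvh_metrics(dose, mask, voxel_volume_cc):
    """Return DVH metrics for the voxels where mask is True (assumed conventions)."""
    roi = np.asarray(dose, dtype=float)[np.asarray(mask, dtype=bool)]
    # Number of voxels that make up 0.1 cc, at least one, at most the whole ROI.
    n_hot = min(max(1, int(round(0.1 / voxel_volume_cc))), roi.size)
    return {
        "D99": np.percentile(roi, 1),    # dose received by at least 99% of the volume
        "D95": np.percentile(roi, 5),    # dose received by at least 95% of the volume
        "D1": np.percentile(roi, 99),    # dose received by the hottest 1% of the volume
        "Dmean": roi.mean(),
        "D0.1cc": np.sort(roi)[-n_hot],  # minimum dose among the hottest 0.1 cc
    }

def dvh_mae(pred_doses, true_doses, masks, voxel_volume_cc):
    """Mean absolute error of each DVH metric over a list of patients."""
    errors = {}
    for pred, true, mask in zip(pred_doses, true_doses, masks):
        p = dvh_metrics(pred, mask, voxel_volume_cc)
        t = dvh_metrics(true, mask, voxel_volume_cc)
        for name in p:
            errors.setdefault(name, []).append(abs(p[name] - t[name]))
    return {name: float(np.mean(vals)) for name, vals in errors.items()}

# Toy example (hypothetical data): the prediction is the reference dose plus 1.0,
# so every metric is shifted by the same amount.
rng = np.random.default_rng(0)
true_dose = 70.0 * rng.random((8, 8, 8))
pred_dose = true_dose + 1.0
mask = np.ones_like(true_dose, dtype=bool)
mae = dvh_mae([pred_dose], [true_dose], [mask], voxel_volume_cc=0.05)
```

In this toy case every metric has an MAE of 1.0 (up to floating-point rounding), which makes a convenient sanity check before applying the functions to real dose distributions.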
Conclusion: This paper presents systematic experiments to guide researchers in selecting models and algorithms for quantifying uncertainty in deep learning-based dose prediction for radiotherapy. The study offers insights into the strengths and limitations of DUP, BNNs, ENNs, and TTDA, which can help researchers make informed decisions when choosing the most suitable approach for their specific needs in radiotherapy dose prediction.