Main Session
Sep 28
PQA 01 - Radiation and Cancer Physics, Sarcoma and Cutaneous Tumors

2176 - Comparison of Adult Autocontouring Models to Manual Delineation of Organs at Risk for Pediatric Patients

02:30pm - 04:00pm PT
Hall F
Screen: 13
POSTER

Presenter(s)

Elizabeth McKone, MD - Mayo Clinic Rochester, Rochester, MN

E. L. McKone1, K. M. Frechette2, S. Armstrong3, M. R. Pringle3, N. Johnson4, A. J. Kehren3, L. M. Undahl3, M. Swain3, A. Goodrich3, P. J. Dizona5, V. Malkov1, D. J. Moseley3, N. N. Laack II1, and A. Mahajan1; 1Department of Radiation Oncology, Mayo Clinic, Rochester, MN, 2Department of Radiation Oncology, Mayo Clinic Rochester, Rochester, MN, 3Mayo Clinic, Rochester, MN, 4Mayo Clinic Rochester, Rochester, MN, 5Department of Clinical Trials and Biostatistics, Mayo Clinic, Rochester, MN

Purpose/Objective(s): Use of automatic segmentation technology during radiation treatment planning has increased the efficiency and accuracy of organ-at-risk (OAR) delineation. However, it is unclear whether adult-trained autosegmentation models are accurate for pediatric OARs. Body and organ size and proportions vary across developmental stages, which poses unique challenges in this population. Our goal is to assess the feasibility and accuracy of applying autosegmentation models previously trained on adults to a pediatric cohort.

Materials/Methods: Eight patients (4 male, 4 female) from each of three age groups (2.5–5.9, 6–13.9, and 14–20 years) with CNS primary malignancies and full-body axial imaging used for CSI planning were identified from our institutional database. For each patient scan, forty-five OARs were manually contoured (the gold standard) and auto-contoured using 5 commercially available and in-house artificial intelligence (AI) autosegmentation models. Manual and auto-contours were compared using quantitative surface Dice similarity coefficient analysis on scans with a corrected 1 mm slice thickness for each patient, OAR, and model. Median surface Dice scores of ≥0.8 were considered clinically acceptable. Performance across the three age groups was assessed for each model and OAR with a Kruskal-Wallis rank sum test, with p<0.05 considered significant.
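The statistical step described above can be illustrated with a minimal sketch. It is not the authors' analysis code: the scores below are hypothetical values invented for illustration, and the helper names (`average_ranks`, `kruskal_wallis_three_groups`, `is_clinically_acceptable`) are assumptions. Only the 0.8 acceptability threshold, the 0.05 significance level, and the three groups of eight patients come from the abstract. With exactly three groups the test has 2 degrees of freedom, so the chi-square tail probability reduces to exp(-H/2) and no statistics library is needed.

```python
import math
from statistics import median

ACCEPT_THRESHOLD = 0.8   # clinically acceptable median surface Dice (from the abstract)
ALPHA = 0.05             # significance level (from the abstract)

def average_ranks(values):
    """Return 1-based ranks of `values`, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_three_groups(g1, g2, g3):
    """Kruskal-Wallis H (tie-corrected) and p-value for exactly three groups."""
    groups = [g1, g2, g3]
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    rank_term, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        rank_term += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction
    p = math.exp(-h / 2)               # chi-square survival function, df = 2
    return h, p

def is_clinically_acceptable(scores):
    """Apply the abstract's criterion: median surface Dice >= 0.8."""
    return median(scores) >= ACCEPT_THRESHOLD

# HYPOTHETICAL surface Dice scores for one OAR/model pair, 8 patients per age group.
youngest = [0.52, 0.55, 0.58, 0.60, 0.61, 0.63, 0.65, 0.66]   # 2.5-5.9 years
middle   = [0.70, 0.72, 0.74, 0.75, 0.77, 0.78, 0.79, 0.80]   # 6-13.9 years
oldest   = [0.82, 0.84, 0.85, 0.87, 0.88, 0.90, 0.91, 0.93]   # 14-20 years

h_stat, p_value = kruskal_wallis_three_groups(youngest, middle, oldest)
age_effect = p_value < ALPHA
acceptable_overall = is_clinically_acceptable(youngest + middle + oldest)
```

In this invented example the pair would show a significant age effect yet fall short of the acceptability threshold for the cohort as a whole, which mirrors the pattern the abstract reports for some OARs.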

Results: There were 60 discrete AI model and OAR combinations, with 11 OARs auto-contoured by more than one model. The remaining 34 OARs were contoured by a single model. Clinically acceptable surface Dice scores were seen for 17/60 (28.3%) OAR/model pairs. Median surface Dice scores among the entire cohort were <0.5 for 5 OARs: oral cavity, pancreas, thyroid, larynx, and spinal cord. Median surface Dice scores of ≥0.9 were seen for 6 OARs: brain, left cochlea, bilateral lenses, mandible, and right lung. Significant differences between age categories were seen across 11/60 (18.3%) discrete OAR/model pairs for larynx, left carotid artery, right kidney, bilateral lungs, bilateral femoral heads, and esophagus. The youngest age group underperformed, and surface Dice scores improved as patients approached adulthood for all of these OARs except the right kidney.
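The counts reported above can be cross-checked with a few lines of arithmetic. This sketch uses only numbers stated in the abstract; the variable names and the `pct` helper are illustrative, and the 26 pairs attributed to multi-model OARs is a derived figure, not one the authors report.

```python
# Counts taken directly from the abstract.
total_oars = 45
multi_model_oars = 11
total_pairs = 60
acceptable_pairs = 17
age_sensitive_pairs = 11

# Derived: OARs contoured by a single model contribute one pair each.
single_model_oars = total_oars - multi_model_oars
pairs_from_multi_model_oars = total_pairs - single_model_oars

def pct(count, total):
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * count / total, 1)

acceptable_pct = pct(acceptable_pairs, total_pairs)
age_sensitive_pct = pct(age_sensitive_pairs, total_pairs)
```

The derived values agree with the text: 34 single-model OARs, 28.3% clinically acceptable pairs, and 18.3% pairs with a significant age effect.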

Conclusion: In a real-world analysis using adult-trained autosegmentation models across a pediatric population, clinically acceptable OAR contours were achieved in only 28.3% of OAR/model pairs, with the remainder likely requiring editing and/or manual re-contouring. Differences between age categories occurred in 18.3% of OAR/model pairs, with the models typically underperforming in the youngest age group of children <6 years old. This suggests limited utility in applying adult-trained autosegmentation models in a pediatric population. Therefore, development of pediatric-trained models and/or training of more robust models capable of performing across a broader range of age groups is needed.