Main Session
Sep 28
PQA 01 - Radiation and Cancer Physics, Sarcoma and Cutaneous Tumors

2138 - Feasibility of Weighting for Physician Satisfaction Based on Correlation between an In-House Developed Physician-Blind Test and Geometric Similarity in Heart Auto-Segmentation

02:30pm - 04:00pm PT
Hall F
Screen: 26
POSTER

Presenter(s)

Jae Choon Lee, MS - Korea University Medicine, Anam Hospital, Seoul

J. C. Lee1, E. J. Heo2, S. H. Cho3, D. Lee4, K. H. Chang5, J. B. Shim1, C. Y. Kim3, N. K. Lee3, and S. Lee3; 1Department of Medical Physics, Kyonggi University, Suwon-si, Korea, Republic of (South), 2Department of Medical Physics, Graduate School of Korea University, Sejong, Korea, Republic of (South), 3Department of Radiation Oncology, College of Medicine, Korea University, Seoul, Korea, Republic of (South), 4Department of Sales and CS, OncoSoft, Seoul, Korea, Republic of (South), 5Department of Radiologic Science, Far East University, Chungcheongbuk-do, Korea, Republic of (South)

Purpose/Objective(s): This study aims to evaluate the correlation between an in-house developed physician-blind test and geometric similarity metrics to determine weighting for physician satisfaction in heart auto-segmentation.

Materials/Methods: Retrospective data (CT images and structure sets) from 30 breast cancer patients at our institution were obtained. A radiation oncologist with extensive experience contoured the heart, left and right lungs, esophagus, and thyroid according to RTOG guidelines. Using commercial auto-segmentation software, we analyzed the geometric similarity of a fully convolutional DenseNet (FCDN) model in terms of the Dice similarity coefficient (DSC), 95th-percentile Hausdorff distance (HD95), and mean surface distance (MSD). To assess physician satisfaction, observers (a radiation oncologist, radiologist, medical physicist, and dosimetrist) completed an in-house developed physician-blind template. Total scores were calculated from 8 questions covering missing slices, whether the heart was sufficiently delineated in a single slice and across all slices, borders, chambers, great vessels, and coronary arteries. The physician-blind test was scored in three categories: "Unacceptable with major corrections" (score 0-3), "Acceptable with minor corrections" (score 4-6), and "Acceptable with no corrections" (score 7-8). To assess the feasibility of applying weighting, we analyzed the Pearson correlation between the physician-blind test scores for each question and each geometric similarity index. The Pearson coefficient (r) was interpreted as follows: 0.7 to 0.9, high positive correlation; 0.5 to 0.7, moderate positive correlation; 0.3 to 0.5, low positive correlation; and 0.0 to 0.3, negligible correlation. Statistical analysis was performed using the Wilcoxon signed-rank test (significance level p<0.05).
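The correlation step described above can be illustrated with a minimal Python sketch of the Pearson coefficient between per-patient question scores and a geometric index. The function name `pearson_r` and its error handling are assumptions made for illustration; the abstract does not state which software performed the analysis.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences.

    Hypothetical helper: x could hold one question's scores per patient,
    y the matching DSC, HD95, or MSD values.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centered products and centered squares.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A constant sequence has zero variance, so r is undefined.
        raise ValueError("Pearson r is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```

The coefficient is undefined when either sequence is constant, which is why the sketch raises an error in that case. A question scored identically for every patient would therefore have no coefficient to report.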

Results: For geometric similarity, DSC, MSD, and HD95 were 0.95 ± 0.01, 1.45 ± 0.60 mm, and 5.13 ± 1.67 mm, respectively. For the physician-blind test, the scores for Questions 1 to 8 were 0.83 ± 0.29, 0.76 ± 0.38, 1.00 ± 0.00, 0.98 ± 0.07, 0.66 ± 0.15, 0.83 ± 0.20, 0.71 ± 0.20, and 1.00 ± 0.02, respectively. We found a moderate correlation between the question for the left atrium (Q6-1) and both DSC and HD95 (Q6-1: rDSC=0.63, rHD95=0.59). We found a low correlation between HD95 and the question for the lateral border (Q5-5), between MSD and the question for the left atrium (Q6-1), between HD95 and the question for the ascending aorta (Q7-1), and between HD95 and the question for the pulmonary artery (Q7-2) (Q5-5: rHD95=0.42; Q6-1: rMSD=0.45; Q7-1: rHD95=0.30; Q7-2: rHD95=0.30).
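As a check on the labels above, the following Python sketch applies the interpretation bands stated in Materials/Methods to the reported coefficients. The function `interpret_r` is hypothetical. Treating each band's lower bound as inclusive is an assumption, chosen because the abstract labels r=0.30 as a low correlation; values outside 0.0 to 0.9 are not covered by the stated bands.

```python
def interpret_r(r):
    """Label a Pearson coefficient using the bands stated in the abstract.

    Assumption: lower bounds are inclusive (r = 0.30 is reported as "low").
    """
    if r < 0.0 or r > 0.9:
        return "outside stated bands"
    if r >= 0.7:
        return "high"
    if r >= 0.5:
        return "moderate"
    if r >= 0.3:
        return "low"
    return "negligible"


# Coefficients reported in the Results, keyed by (question, metric).
reported = {
    ("Q6-1", "DSC"): 0.63,
    ("Q6-1", "HD95"): 0.59,
    ("Q5-5", "HD95"): 0.42,
    ("Q6-1", "MSD"): 0.45,
    ("Q7-1", "HD95"): 0.30,
    ("Q7-2", "HD95"): 0.30,
}

labels = {key: interpret_r(r) for key, r in reported.items()}
for (question, metric), label in labels.items():
    print(f"{question} vs {metric}: r={reported[(question, metric)]:.2f} ({label})")
```

Under these assumptions the two Q6-1 coefficients for DSC and HD95 fall in the moderate band and the remaining four fall in the low band, matching the wording of the Results.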

Conclusion: We demonstrated the feasibility of applying weighting for physician satisfaction by analyzing the correlation between an in-house developed physician-blind test and geometric similarity in heart auto-segmentation. Future research should focus on assessing the relative importance of individual test questions and developing an optimized scoring system for clinical use.