Integrating Generative AI into Nomograms for Breast Cancer Nodal Risk Predictions.

Annals of surgical oncology 2026

Shah A, Love JA, Butler M, Butler LN, Mittendorf EA, King TA, Park KU

Cite this paper

APA Shah A, Love JA, et al. (2026). Integrating Generative AI into Nomograms for Breast Cancer Nodal Risk Predictions. Annals of surgical oncology. https://doi.org/10.1245/s10434-026-19482-8
MLA Shah A, et al. "Integrating Generative AI into Nomograms for Breast Cancer Nodal Risk Predictions." Annals of surgical oncology, 2026.
PMID 41824209

Abstract

[BACKGROUND] Nomograms predicting the likelihood of sentinel lymph node (SLN) metastasis in early-stage breast cancer can aid surgical decision-making but are underused due to the burden of variable review and input. This study evaluated whether OpenAI's large language models (LLMs) can extract information required for nomogram use and reproduce the estimated rate of SLN metastasis using the Memorial Sloan Kettering and MD Anderson nomograms.

[METHODS] The study analyzed the de-identified radiology and pathology notes for 20 patients. Three prompts were tested: (1) o1 prompted to generate SLN metastasis estimates without nomograms, (2) o1 prompted to use the nomogram from online calculators with and without serial corrections, and (3) GPT-4o prompted to use chain-of-thought reasoning from nomogram variables and the corresponding point values from a pictorial nomogram. Artificial intelligence (AI) estimates of SLN metastasis rates were compared with a physician-expert's manual use of nomograms.

[RESULTS] OpenAI o1 captured all clinical variables in 65% of cases without serial correction (94-95% of individual variables) and 80% of cases with serial correction (96-97% of individual variables), with tumor size most frequently misidentified. Agreement between LLM-generated and physician-calculated risk prediction was low (exact matches in 0-10% of cases, near agreement in 20-45% of cases), indicating moderate rater reliability (intraclass correlation coefficient [ICC], 0.57-0.62). The optimized GPT-4o prompt demonstrated greater agreement (exact matches in 25% and near agreement in 90% of cases) and reliability (ICC, 0.95).
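
The agreement metrics reported above (exact matches, near agreement, and ICC) can be illustrated with a minimal Python sketch. The abstract does not say which ICC form was used or how "near agreement" was defined, so everything specific here is an assumption: the function name `agreement_summary`, the `tolerance` of 5 percentage points, and the choice of ICC(2,1) (two-way random effects, absolute agreement, single rater) are hypothetical, and the numbers in the usage note are made up for illustration rather than taken from the study.

```python
def agreement_summary(physician, model, tolerance=5.0):
    """Exact-match rate, near-agreement rate, and ICC(2,1) for paired estimates.

    physician, model: equal-length lists of risk estimates (percent), one per case.
    tolerance: hypothetical cutoff (percentage points) for "near agreement".
    """
    if len(physician) != len(model) or len(physician) < 2:
        raise ValueError("need two equal-length lists with at least 2 cases")

    n, k = len(physician), 2  # n cases, k = 2 raters (physician and model)
    pairs = list(zip(physician, model))

    # Share of cases where the two estimates are identical / within tolerance.
    exact = sum(1 for p, m in pairs if p == m) / n
    near = sum(1 for p, m in pairs if abs(p - m) <= tolerance) / n

    # Two-way ANOVA decomposition: cases are rows, raters are columns.
    grand = sum(p + m for p, m in pairs) / (n * k)
    row_means = [(p + m) / k for p, m in pairs]
    col_means = [sum(physician) / n, sum(model) / n]

    ss_total = sum((x - grand) ** 2 for pair in pairs for x in pair)
    ss_rows = k * sum((r - grand) ** 2 for r in row_means)
    ss_cols = n * sum((c - grand) ** 2 for c in col_means)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    # ICC(2,1); undefined (division by zero) if every estimate is identical.
    icc = (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
    return {"exact": exact, "near": near, "icc": icc}
```

For example, `agreement_summary([10, 20, 30, 40], [10, 22, 29, 50])` uses illustrative values only. Exact matches and ICC measure different things: an ICC can be high even when few estimates match exactly, as long as the two raters rank and scale cases similarly.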

[CONCLUSION] Large language models can reliably extract nomogram inputs from clinical notes but require task-specific prompt engineering for accurate SLN metastasis risk estimation. Currently, automating nomogram-based risk estimators with AI may not justify the significant resources required for optimization.
