
ChatGPT-5 Matches Surgeon-Level Assessment of Facelift Candidacy: A Pilot Proof-of-Concept Study.

Aesthetic Plastic Surgery, 2026

Saeed AT, Breakey RWF, Saleh DB, Tiryaki KT, Mayou BJ, Saeed TM


Abstract

[BACKGROUND] The use of multimodal artificial intelligence (AI) in plastic surgery is steadily increasing. Whether a general-purpose multimodal AI tool can, from photographs alone, assess facial aging and facelift candidacy at a level comparable to board-certified specialist plastic surgeons remains unknown.

[OBJECTIVES] To determine if ChatGPT-5 (OpenAI, San Francisco, CA, USA) can identify facial aging features, stratify severity, and judge facelift candidacy from photographs alone, compared with board-certified plastic surgeons.

[METHODS] Two-center observational pilot. Twenty-two volunteers (mean age 42.0 ± 16.8 years; median 34 years; range 24-80) provided standardized four-view facial composite photographs. Five board-certified plastic surgeons independently completed an eight-item questionnaire per case. ChatGPT-5 assessed the same images with identical wording. Assessments were image-only and blinded (no demographics/history). Surgeon consensus was defined by plurality. Primary outcomes were agreement and Cohen's κ; for ordinal items, weighted κ, Spearman's ρ, and mean absolute error (MAE) were reported. McNemar's test assessed discordance for binary items.
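The agreement statistics named in the methods can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names, the tie-breaking rule for plurality, and all ratings below are hypothetical, chosen only to show how a plurality consensus, unweighted Cohen's κ, and MAE are computed.

```python
from collections import Counter

def plurality(votes):
    """Most frequent label among the raters.
    Ties go to the label seen first (an assumption; the abstract gives no tie rule)."""
    return Counter(votes).most_common(1)[0][0]

def cohen_kappa(a, b):
    """Unweighted Cohen's kappa between two equal-length label sequences."""
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n              # observed agreement
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[k] * cb[k] for k in set(a) | set(b)) / n**2  # chance agreement
    if p_e == 1:
        return 1.0  # degenerate case: both raters used a single identical label
    return (p_o - p_e) / (1 - p_e)

def mean_absolute_error(a, b):
    """Mean absolute difference between two ordinal (integer-coded) ratings."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

# Hypothetical binary ratings: 4 cases, 5 surgeons each (1 = feature present).
surgeon_votes = [[1, 1, 0, 1, 1], [0, 0, 0, 1, 0], [1, 0, 1, 1, 0], [0, 0, 1, 0, 0]]
consensus = [plurality(v) for v in surgeon_votes]  # [1, 0, 1, 0]
model = [1, 0, 0, 0]                               # hypothetical AI answers

print(cohen_kappa(consensus, model))                      # 0.5
print(mean_absolute_error([0, 1, 2, 3], [0, 1, 1, 3]))    # 0.25
```

In this toy example the observed agreement is 3/4 and the chance agreement is 0.5, so κ = (0.75 − 0.5)/(1 − 0.5) = 0.5. Weighted κ and Spearman's ρ follow the same pattern but additionally need a disagreement-weight matrix and a rank transform, which are omitted here.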

[RESULTS] For facelift candidacy, agreement was 95.5% (21/22; Cohen's κ = 0.91; McNemar P = 1.00). For binary aging features, agreement ranged from 81.8 to 90.9% (κ ≈ 0.61 to 0.81). For ordinal severity (lower face and midface), exact agreement was 77.3%, disagreements were adjacent only, weighted κ = 0.74 to 0.86, Spearman's ρ = 0.84 (P < .001). Inter-surgeon agreement on ordinal items was moderate to fair. For the adjunct-procedure recommendation, Top-1 accuracy was 70.6% (12/17; κ = 0.58) and Top-2 agreement was 77.3% (17/22).
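As a quick consistency check, the reported percentages can be reproduced from their fractions, and a single discordant case out of 22 does yield McNemar P = 1.00 under the exact (binomial) form of the test. The sketch below is only illustrative: the abstract does not say whether the exact or chi-square form was used, and the helper names are hypothetical.

```python
from math import comb

def pct(num, den):
    """Percentage rounded to one decimal place."""
    return round(100 * num / den, 1)

def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts b and c."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, nothing to test
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n  # P(X <= k), X ~ Bin(n, 0.5)
    return min(1.0, 2 * tail)

print(pct(21, 22))  # 95.5  (facelift candidacy agreement)
print(pct(12, 17))  # 70.6  (Top-1, adjunct procedure)
print(pct(17, 22))  # 77.3  (Top-2, adjunct procedure)

# 21/22 agreement leaves one discordant case; its direction does not change the p-value.
print(mcnemar_exact_p(1, 0))  # 1.0
```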

[CONCLUSIONS] In a blinded, standardized-photograph setting, ChatGPT-5 matched surgeons on binary facelift candidacy assessment and closely tracked severity grading with small, one-level differences at most. These findings may support use as a decision-support tool (triage, patient education) while surgeons retain hands-on examination and personalized planning. Larger, multicenter studies with more diverse image datasets are warranted to confirm generalizability and define deployment standards.

[LEVEL OF EVIDENCE IV] This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine ratings, please refer to the Table of Contents or the online Instructions to Authors www.springer.com/00266.

Extracted medical entities (NER)

Type	English term	Korean / gloss	Source	Occurrences
Procedure	`facelift`	안면거상술	dict	5
Complication	`four-view facial`		scispacy	1
Drug	`[BACKGROUND]`		scispacy	1
Drug	`[OBJECTIVES]`		scispacy	1
Drug	`[CONCLUSIONS] In`		scispacy	1
Disease	`Top-1`		scispacy	1
Disease	`standardized-photograph`		scispacy	1
Other	`Top-1`		scispacy	1
Other	`patient`		scispacy	1

📑 Citation relations

References cited by this paper: 18

External PMIDs: 15 (not collected in the DB)

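🔗 Co-occurring domains

Categories that are frequently covered together in papers sharing this paper's category

Blepharoplasty (66) Rhinoplasty (55) Liposuction (43) Superficial musculoaponeurotic system (40) Mammaplasty (35) Breast (32) Flap reconstruction (31) Hematoma (22)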