ChatGPT-5 Matches Surgeon-Level Assessment of Facelift Candidacy: A Pilot Proof-of-Concept Study.

Aesthetic Plastic Surgery, 2026

Saeed AT, Breakey RWF, Saleh DB, Tiryaki KT, Mayou BJ, Saeed TM

Abstract

[BACKGROUND] The use of multimodal artificial intelligence (AI) in plastic surgery is steadily increasing. Whether a general-purpose multimodal AI tool can, from photographs alone, assess facial aging and facelift candidacy at a level comparable to board-certified specialist plastic surgeons remains unknown.

[OBJECTIVES] To determine whether ChatGPT-5 (OpenAI, San Francisco, CA, USA) can identify facial aging features, stratify their severity, and judge facelift candidacy from photographs alone, compared with board-certified plastic surgeons.

[METHODS] Two-center observational pilot. Twenty-two volunteers (mean age 42.0 ± 16.8 years; median 34 years; range 24-80) provided standardized four-view facial composite photographs. Five board-certified plastic surgeons independently completed an eight-item questionnaire per case. ChatGPT-5 assessed the same images with identical wording. Assessments were image-only and blinded (no demographics/history). Surgeon consensus was defined by plurality. Primary outcomes were agreement and Cohen's κ; for ordinal items, weighted κ, Spearman's ρ, and mean absolute error (MAE) were reported. McNemar's test assessed discordance for binary items.
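
The agreement statistics named in the Methods can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names, the tie-breaking rule for plurality (first-seen label wins), and the toy ratings are all assumptions, since the abstract does not specify them.

```python
from collections import Counter

def plurality(votes):
    """Most common label among raters. Ties go to the label seen first
    (an assumption; the abstract does not describe tie handling)."""
    return Counter(votes).most_common(1)[0][0]

def percent_agreement(a, b):
    """Fraction of cases where the two ratings are identical."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohen_kappa(a, b):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(a)
    p_o = percent_agreement(a, b)
    ca, cb = Counter(a), Counter(b)
    # Chance agreement from the product of the two raters' marginals.
    p_e = sum(ca[k] * cb[k] for k in set(a) | set(b)) / n**2
    if p_e == 1:  # both raters used a single identical label throughout
        return 1.0
    return (p_o - p_e) / (1 - p_e)

def mean_absolute_error(a, b):
    """Mean absolute difference between two ordinal (integer-coded) ratings."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

# Hypothetical toy data: four cases, five surgeon votes each (1 = yes, 0 = no).
surgeon_votes = [[1, 1, 0, 1, 1], [0, 0, 0, 1, 0], [1, 1, 1, 1, 1], [0, 1, 0, 0, 0]]
consensus = [plurality(v) for v in surgeon_votes]
model = [1, 0, 1, 1]  # hypothetical model answers for the same four cases

print(consensus, percent_agreement(consensus, model), cohen_kappa(consensus, model))
```

Weighted κ and Spearman's ρ follow the same pattern but are usually taken from a statistics library rather than written by hand.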

[RESULTS] For facelift candidacy, agreement was 95.5% (21/22; Cohen's κ = 0.91; McNemar P = 1.00). For binary aging features, agreement ranged from 81.8 to 90.9% (κ ≈ 0.61 to 0.81). For ordinal severity (lower face and midface), exact agreement was 77.3%, disagreements were adjacent only, weighted κ = 0.74 to 0.86, Spearman's ρ = 0.84 (P < .001). Inter-surgeon agreement on ordinal items was moderate to fair. For the adjunct-procedure recommendation, Top-1 accuracy was 70.6% (12/17; κ = 0.58) and Top-2 agreement was 77.3% (17/22).
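
As a rough check on the reported figures, the percentages and the McNemar result can be reproduced from the counts given above. The sketch below assumes the exact (binomial) form of McNemar's test, which the abstract does not confirm, and the function name is illustrative. With 21 of 22 cases in agreement there is exactly one discordant pair, so the two-sided P value is 1.00 whichever direction the disagreement ran.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar P value from the two discordant counts
    (b: rater A yes / rater B no, c: rater A no / rater B yes)."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Binomial(n, 0.5) lower tail up to the smaller discordant count.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)

# Percentages recomputed from the counts stated in the Results.
candidacy_pct = round(100 * 21 / 22, 1)   # facelift candidacy agreement
top1_pct = round(100 * 12 / 17, 1)        # adjunct procedure, Top-1
top2_pct = round(100 * 17 / 22, 1)        # adjunct procedure, Top-2

# One discordant case out of 22; its direction is not reported.
p_value = mcnemar_exact(1, 0)

print(candidacy_pct, top1_pct, top2_pct, p_value)
```

A non-significant McNemar result here says only that the single disagreement shows no directional bias; with one discordant pair the test has essentially no power.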

[CONCLUSIONS] In a blinded, standardized-photograph setting, ChatGPT-5 matched surgeons on binary facelift candidacy assessment and closely tracked their severity grading, differing by at most one level. These findings may support its use as a decision-support tool (triage, patient education) while surgeons retain hands-on examination and personalized planning. Larger, multicenter studies with more diverse image datasets are warranted to confirm generalizability and define deployment standards.

[LEVEL OF EVIDENCE IV] This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine ratings, please refer to the Table of Contents or the online Instructions to Authors at www.springer.com/00266.

