
Evaluating Large Language Models as Medical Consultation Tools for Double Eyelid Surgery: A Cross-Language Study in English and Chinese.

Aesthetic Plastic Surgery 2026 Vol. 50(5) p. 1706-1716


Abstract

[BACKGROUND] Double eyelid surgery is a common cosmetic procedure that creates a crease in the upper eyelid. Because many people seeking the procedure understand it poorly, consultation requests are numerous and place a heavy burden on plastic surgeons. The rise of large language models (LLMs) offers a potential solution to this issue.

[METHODS] This study used an online questionnaire to collect sixteen questions of common concern to individuals seeking the surgery, then assessed how well fifteen popular LLMs answered these questions with both English and Chinese inputs. Three expert eyelid plastic surgeons scored every LLM response on five dimensions: professionalism, patient friendliness, informativeness, practicality, and logical clarity. The scores were analyzed statistically using the Friedman test and the Nemenyi post-hoc test.
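As a rough illustration of the statistical step named above, the following minimal sketch ranks models within each question, computes the Friedman chi-square statistic, and flags model pairs whose mean ranks differ by more than the Nemenyi critical difference. It is not the authors' code: the scores, the three-model setup, and all function names are hypothetical, the statistic omits the tie correction, and the critical values are the commonly tabulated alpha = 0.05 constants for two to five groups.

```python
import math

# Commonly tabulated Nemenyi critical values q_alpha at alpha = 0.05,
# keyed by the number of compared groups k (assumption: values from
# standard tables; only k = 2..5 are listed here).
Q_ALPHA_05 = {2: 1.960, 3: 2.343, 4: 2.569, 5: 2.728}


def rank_row(scores):
    """Rank one question's scores; the highest score gets rank 1.
    Tied scores share the average of the ranks they span."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for m in range(i, j + 1):
            ranks[order[m]] = average_rank
        i = j + 1
    return ranks


def friedman(score_rows):
    """Return (mean ranks per model, Friedman chi-square statistic).
    Rows are questions (blocks); columns are models. No tie correction."""
    n = len(score_rows)
    k = len(score_rows[0])
    rank_rows = [rank_row(row) for row in score_rows]
    mean_ranks = [sum(r[j] for r in rank_rows) / n for j in range(k)]
    chi2 = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4
    )
    return mean_ranks, chi2


def nemenyi_critical_difference(k, n, q_alpha):
    """Mean-rank gap above which two models differ significantly."""
    return q_alpha * math.sqrt(k * (k + 1) / (6 * n))


# Hypothetical scores: 6 questions (rows) x 3 models (columns).
scores = [
    [4.5, 3.0, 2.0],
    [4.0, 3.5, 2.5],
    [5.0, 3.0, 1.5],
    [4.5, 4.0, 2.0],
    [3.5, 4.0, 2.5],
    [4.0, 3.0, 2.0],
]

mean_ranks, chi2 = friedman(scores)
cd = nemenyi_critical_difference(k=3, n=len(scores), q_alpha=Q_ALPHA_05[3])

# Pairs of model indices whose mean ranks differ by more than the CD.
significant_pairs = [
    (a, b)
    for a in range(3)
    for b in range(a + 1, 3)
    if abs(mean_ranks[a] - mean_ranks[b]) > cd
]

print("mean ranks:", mean_ranks)
print("Friedman chi-square:", chi2)
print("critical difference:", cd)
print("significantly different pairs:", significant_pairs)
```

In a setting like the one described, each evaluation dimension and each input language would presumably get its own table of scores, with one row per question and one column per model.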

[RESULTS] With English input, ERNIE-Bot, ChatGPT-4o, and Gemini-2.0-Flash consistently ranked among the top three across most evaluation dimensions. In contrast, Claude-3.7-Sonnet, HuatuoGPT, ZoeGPT, CompliantGPT, and BastionGPT ranked lower across all dimensions, with performance significantly lagging behind the top performers. For Chinese input, DeepSeek-R1 maintained a leading position across all dimensions, forming the first tier alongside DeepSeek-V3, Gemini-2.0-Flash, and ERNIE-Bot. Meanwhile, Claude-3.5-Haiku, ZoeGPT, Llama3.3-70B-Instruct, CompliantGPT, HuatuoGPT, and BastionGPT ranked lower in multiple dimensions, with a significant gap relative to first-tier models.

[CONCLUSION] This study demonstrated LLMs' potential as medical consultation tools for double eyelid surgery, providing useful guidance for both English and Chinese users. Future research should focus on fine-tuning LLMs with more specialized medical data and exploring workflows for surgeon-LLM collaboration to validate their clinical utility.

[LEVEL OF EVIDENCE V] This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine ratings, please refer to the Table of Contents or the online Instructions to Authors at www.springer.com/00266.

Extracted medical entities (NER)

Type	English term	Korean / gloss	Source	Count
Anatomy	`eyelid`	눈꺼풀 (eyelid)	dict	4
Procedure	`double eyelid`	안검성형술 (blepharoplasty)	dict	3
Anatomy	`upper eyelid`	눈꺼풀 (eyelid)	dict	1
Anatomy	`crease`		scispacy	1
Anatomy	`ERNIE-Bot`		scispacy	1
Complication	`eyelid plastic`		scispacy	1
Drug	`[BACKGROUND] Double`		scispacy	1
Drug	`BastionGPT`		scispacy	1
Disease	`Language`		scispacy	1
Other	`patient`		scispacy	1
Other	`CompliantGPT`		scispacy	1

MeSH Terms

Humans; Blepharoplasty; Language; Referral and Consultation; Surveys and Questionnaires; Female; Eyelids; Male; China; Adult; Middle Aged; Surgery, Plastic; Large Language Models; East Asian People

Citation relationships

References cited by this paper: 28

External PMIDs: 21 (not collected in the DB)

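Co-occurring domains

Categories that frequently appear together with this paper's category in the same papers

Flap reconstruction (123) Facelift (83) Transconjunctival approach (74) Rhinoplasty (67) Nasal septum (51) Levator palpebrae superioris (45) Mammaplasty (35) Infection (31)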