Evaluating Large Language Models as Medical Consultation Tools for Double Eyelid Surgery: A Cross-Language Study in English and Chinese.
Abstract
[BACKGROUND] Double eyelid surgery is a common cosmetic procedure that creates a crease in the upper eyelid. Because many prospective patients understand the procedure poorly, consultation requests are numerous and place a heavy burden on plastic surgeons. Large language models (LLMs) offer a potential way to ease this burden.
[METHODS] This study used an online questionnaire to collect sixteen questions that commonly concern individuals seeking the surgery, then assessed how well fifteen popular LLMs answered them with both English and Chinese inputs. Three expert eyelid plastic surgeons scored every LLM response on multiple dimensions, including professionalism, patient friendliness, informativeness, practicality, and logical clarity. The scores were analyzed statistically using the Friedman test and the Nemenyi post-hoc test.
[RESULTS] With English input, ERNIE-Bot, ChatGPT-4o, and Gemini-2.0-Flash consistently ranked among the top three across most evaluation dimensions. In contrast, Claude-3.7-Sonnet, HuatuoGPT, ZoeGPT, CompliantGPT, and BastionGPT ranked lower across all dimensions, with performance significantly lagging behind the top performers. For Chinese input, DeepSeek-R1 maintained a leading position across all dimensions, forming the first tier alongside DeepSeek-V3, Gemini-2.0-Flash, and ERNIE-Bot. Meanwhile, Claude-3.5-Haiku, ZoeGPT, Llama3.3-70B-Instruct, CompliantGPT, HuatuoGPT, and BastionGPT ranked lower in multiple dimensions, with a significant gap relative to first-tier models.
[CONCLUSION] This study demonstrated LLMs' potential as medical consultation tools for double eyelid surgery, providing useful guidance for both English and Chinese users. Future research should focus on fine-tuning LLMs with more specialized medical data and exploring workflows for surgeon-LLM collaboration to validate their clinical utility.
[LEVEL OF EVIDENCE V] This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine ratings, please refer to the Table of Contents or the online Instructions to Authors www.springer.com/00266 .
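The statistical step named in the Methods (a Friedman test followed by a Nemenyi post-hoc comparison) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three model names, the six rows of scores, and the choice of questions as the blocking unit are hypothetical toy values, not data or design details from the study. The sketch uses only the standard library and omits the tie correction that library routines such as `scipy.stats.friedmanchisquare` apply; the constant 2.343 is the commonly tabulated Nemenyi critical value for three groups at alpha = 0.05.

```python
from itertools import combinations
from math import exp, sqrt


def average_ranks(row):
    """Rank one block's scores (1 = highest score), averaging ranks over ties."""
    order = sorted(range(len(row)), key=lambda j: -row[j])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        # Extend j over any run of tied scores.
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks


def friedman(scores):
    """Return the Friedman chi-square statistic (no tie correction) and mean ranks."""
    n, k = len(scores), len(scores[0])
    rank_rows = [average_ranks(r) for r in scores]
    rank_sums = [sum(r[j] for r in rank_rows) for j in range(k)]
    chi2 = 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)
    return chi2, [s / n for s in rank_sums]


def nemenyi_cd(n, k, q_alpha):
    """Critical difference in mean ranks for the Nemenyi post-hoc test."""
    return q_alpha * sqrt(k * (k + 1) / (6.0 * n))


# Hypothetical toy data: rows are questions, columns are models.
names = ["model_a", "model_b", "model_c"]
scores = [
    [4.7, 4.0, 3.1],
    [4.5, 4.2, 3.0],
    [4.8, 3.9, 3.3],
    [4.1, 4.4, 2.9],
    [4.6, 4.3, 3.2],
    [4.9, 4.1, 3.4],
]

chi2, mean_ranks = friedman(scores)

# With k = 3 groups the statistic has 2 degrees of freedom, where the
# chi-square survival function has the closed form exp(-x / 2).
# This shortcut is valid only for df = 2.
p_value = exp(-chi2 / 2)

Q_ALPHA_K3 = 2.343  # tabulated Nemenyi value for k = 3, alpha = 0.05
cd = nemenyi_cd(len(scores), len(names), Q_ALPHA_K3)

# Two models differ significantly when their mean ranks differ by more than cd.
significant_pairs = [
    (names[a], names[b])
    for a, b in combinations(range(len(names)), 2)
    if abs(mean_ranks[a] - mean_ranks[b]) > cd
]

print(f"chi2={chi2:.3f}, p={p_value:.4f}, CD={cd:.3f}")
print("mean ranks:", dict(zip(names, mean_ranks)))
print("significant pairs:", significant_pairs)
```

For real analyses with fifteen models, a maintained implementation (for example `scipy.stats.friedmanchisquare` together with `posthoc_nemenyi_friedman` from the scikit-posthocs package) is likely a better choice, since it handles ties and general degrees of freedom.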
Extracted medical entities (NER)
| Type | English term | Gloss (translated from Korean) | UMLS CUI | Source | Occurrences |
|---|---|---|---|---|---|
| Anatomy | eyelid | eyelid | | dict | 4 |
| Procedure | double eyelid | blepharoplasty | | dict | 3 |
| Anatomy | upper eyelid | eyelid | | dict | 1 |
| Anatomy | crease | | | scispacy | 1 |
| Anatomy | ERNIE-Bot | | | scispacy | 1 |
| Complication | eyelid plastic | | | scispacy | 1 |
| Drug | [BACKGROUND] Double | | | scispacy | 1 |
| Drug | BastionGPT | | | scispacy | 1 |
| Disease | Language | | | scispacy | 1 |
| Other | patient | | | scispacy | 1 |
| Other | CompliantGPT | | | scispacy | 1 |
MeSH Terms
Humans; Blepharoplasty; Language; Referral and Consultation; Surveys and Questionnaires; Female; Eyelids; Male; China; Adult; Middle Aged; Surgery, Plastic; Large Language Models; East Asian People