Evaluation of Artificial Intelligence Chatbots for Facial Injection Planning: Comparative Performance and Safety Limitations.

Aesthetic Plastic Surgery 2025 Vol. 49(21) p. 5866-5876

Radulesco T, Ebode D, Maniaci A, Gargula S, Saibene AM, Chiesa-Estomba C, Gengler I, Vaira L, Vishnumurthy P, Lechien JR, Michel J

Abstract

[BACKGROUND] To evaluate the performance of artificial intelligence (AI)-powered chatbots in generating treatment plans for facial aesthetic injections, focusing on their accuracy, safety, and clinical applicability.

[METHODS] A comparative observational study was conducted in an otolaryngology tertiary care department according to STROBE guidelines. Patients seeking facial injections were recruited from July to October 2024. Forty patients (85% female; mean age: 45.8 years) underwent photographic documentation and received AI-generated treatment plans for botulinum toxin and hyaluronic acid injections. Six AI chatbots and three generative vision models were evaluated based on five criteria: product selection, injection strategy, facial analysis, alignment with patient preferences, and safety. Likert scale ratings, each ranging from -2 to +2, were analyzed using Friedman and Durbin-Conover pairwise tests to identify significant differences (p < 0.05). The sum of the five Likert scales provided an overall score ranging from -10 to +10.
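
The scoring arithmetic described in the methods can be illustrated with a minimal sketch. It is not the authors' analysis code: the criterion keys, the function names, and the toy ratings are all hypothetical, and the Friedman statistic is computed in its basic form without the tie correction that statistical packages commonly apply.

```python
from typing import Dict, List

# Hypothetical keys for the five criteria named in the abstract.
CRITERIA = (
    "product_selection",
    "injection_strategy",
    "facial_analysis",
    "patient_preferences",
    "safety",
)


def total_score(ratings: Dict[str, int]) -> int:
    """Sum five Likert ratings (each -2..+2) into an overall score (-10..+10)."""
    for name in CRITERIA:
        if not -2 <= ratings[name] <= 2:
            raise ValueError(f"{name} must lie between -2 and +2")
    return sum(ratings[name] for name in CRITERIA)


def average_ranks(values: List[float]) -> List[float]:
    """Rank values from 1..k; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(scores: List[List[float]]) -> float:
    """Basic Friedman chi-square (no tie correction).

    scores[p][c] is the total score of chatbot c for patient p.
    """
    n, k = len(scores), len(scores[0])
    rank_sums = [0.0] * k
    for row in scores:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


if __name__ == "__main__":
    # Toy ratings for one plan; not data from the study.
    plan = dict(zip(CRITERIA, (2, 2, 1, 2, 1)))
    print(total_score(plan))
    # Toy totals for three patients (rows) and three chatbots (columns).
    print(friedman_statistic([[8, 5, 3], [7, 6, 2], [9, 4, 1]]))
```

Ranking within each patient before comparing chatbots is what makes the Friedman test suitable for repeated, ordinal ratings of the same cases; the Durbin-Conover pairwise comparisons mentioned in the abstract would follow as a separate post hoc step and are not sketched here.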

[RESULTS] ChatGPTo1 and ChatGPT4o achieved higher scores than other chatbots across most evaluation criteria, with mean total scores of 7.87 ± 0.29 and 7.85 ± 0.44, respectively (p = 0.295). Both chatbots were statistically superior (p < 0.05) to Claude, CopilotPro, and Llama in product selection (ChatGPT4o = 1.92 ± 0.05), injection strategy precision (ChatGPTo1 = 1.67 ± 0.08), alignment with patient preferences (ChatGPTo1 = 1.95 ± 0.03) and safety (ChatGPTo1 = 1.30 ± 0.17). Claude provided relevant facial analysis (1.50 ± 0.16) without significant difference compared to ChatGPT models (all p > 0.05). Generative vision models failed to produce relevant visual annotations.

[CONCLUSION] Among the AI systems tested, ChatGPT-based chatbots demonstrated relatively superior performance in generating treatment plans for facial injections. However, safety limitations remain and preclude unsupervised clinical use.

[LEVEL OF EVIDENCE IV] This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine ratings, please refer to the Table of Contents or the online Instructions to Authors www.springer.com/00266.

Extracted medical entities (NER)

Type	English expression	Korean / gloss	Source	Occurrences
Procedure	`botulinum toxin`	보툴리눔독소 주사 (botulinum toxin injection)	dict	1
Material	`hyaluronic acid`	히알루론산 (hyaluronic acid)	dict	1
Drug	`[BACKGROUND]`		scispacy	1
Drug	`[RESULTS] ChatGPTo1`		scispacy	1
Drug	`ChatGPT4o`		scispacy	1
Drug	`ChatGPT`		scispacy	1
Other	`STROBE`		scispacy	1
Other	`Patients`		scispacy	1
Other	`female`		scispacy	1
Other	`patient`		scispacy	1
Other	`Llama`		scispacy	1
Other	`ChatGPT4o`		scispacy	1
Other	`ChatGPTo1`		scispacy	1

MeSH Terms

Humans; Female; Middle Aged; Male; Artificial Intelligence; Hyaluronic Acid; Adult; Cosmetic Techniques; Face; Skin Aging; Dermal Fillers; Patient Care Planning; Generative Artificial Intelligence

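Co-occurring domains

Categories that are frequently covered together in papers from the same category as this paper

Breast (22) Blepharoplasty (22) Rhinoplasty (21) Facelift (21) Liposuction (19) Infection (17) Mammaplasty (13) Subcutaneous tissue (12)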
