Evaluation of the Appropriateness and Readability of ChatGPT Responses to Patient Queries on Uveitis

Presenter:

Saeed Mohammadi

Authors:

S. Saeed Mohammadi1, Anadi Khatri1,2,3, Tanya Jain1,4, Zheng Xian Thng1,5, Woong-sun Yoo1,6, Negin Yavari1, Vahid Bazojoo1, Azadeh Mobasserian1, Amir Akhavanrezayat1, Quan Dong Nguyen1

1. Byers Eye Institute, Department of Ophthalmology, Stanford University, Palo Alto, California, United States

2. Birat Aankha Aspatal, Biratnagar, Nepal

3. Department of Ophthalmology, Birat Medical College and Teaching Hospital, Kathmandu University, Biratnagar, Nepal

4. Dr. Shroff Charity Eye Hospital, New Delhi, India

5. National Healthgroup Eye Institute, Tan Tock Seng Hospital, Singapore

6. Gyeongsang National University Hospital, Jinju, S. Korea

Purpose: To compare the utility of ChatGPT as an online uveitis patient education resource with established web-based patient education platforms.

Methods: The top 8 uveitis patient education websites indexed by Google as of November 2023 were included in the study. Information regarding uveitis was compiled from the Healthline, Mayo Clinic, WebMD, National Eye Institute, Ocular Uveitis and Immunology Foundation, American Academy of Ophthalmology, Cleveland Clinic, and National Health Service websites. The queries addressed on these websites were each posed to ChatGPT 4.0 three times, and the responses were recorded. The process was then repeated three more times with the following request appended to each query, to elicit a simplified ChatGPT response: 'Please provide a response suitable for the average American adult, at a 6th grade comprehension level.' Three vitreoretinal specialists, all masked to the sources, graded the content in terms of personal preference, comprehensiveness, and accuracy. Additionally, six readability indices (Flesch Reading Ease, Flesch-Kincaid Grade Level, Gunning Fog Index, Coleman-Liau Index, Simple Measure of Gobbledygook, and FORCAST grade index) were calculated using an online calculator, Readable.com, to assess the ease of comprehension of each answer.
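
For readers unfamiliar with the readability indices, two of them (Flesch Reading Ease and Flesch-Kincaid Grade Level) follow simple published formulas based on average sentence length and average syllables per word. The Python sketch below illustrates those two formulas only. It is not the study's pipeline, which used Readable.com, and its results will differ from that tool because the syllable counter here is a crude vowel-group heuristic. All function names are hypothetical and chosen for illustration.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (assumption: vowel groups approximate syllables)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Discount a trailing silent 'e' (e.g. "disease"), but not "-le" endings.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> tuple[int, int, int]:
    """Return (sentence count, word count, syllable count) for a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    n_sent, n_words, n_syll = text_stats(text)
    return 206.835 - 1.015 * (n_words / n_sent) - 84.6 * (n_syll / n_words)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade required."""
    n_sent, n_words, n_syll = text_stats(text)
    return 0.39 * (n_words / n_sent) + 11.8 * (n_syll / n_words) - 15.59


if __name__ == "__main__":
    sample = "Uveitis is swelling inside the eye. It can cause pain and blurry vision."
    print(round(flesch_reading_ease(sample), 1))
    print(round(flesch_kincaid_grade(sample), 1))
```

Under these formulas, shorter sentences and shorter words raise the Reading Ease score and lower the grade level, which is the direction of change the simplified prompt was intended to produce.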

Results: A total of 497 responses, comprising 71 from existing websites, 213 standard responses from ChatGPT, and 213 simplified responses from ChatGPT, were recorded and graded. Standard ChatGPT responses were preferred and perceived to be more comprehensive by trained specialist ophthalmologists, while maintaining a level of accuracy similar to that of existing websites. Moreover, simplified ChatGPT responses matched almost all existing websites in terms of personal preference, accuracy, and comprehensiveness (Figure 1). Notably, almost all readability indices suggested that standard ChatGPT responses demand a higher educational level for comprehension than existing websites, whereas simplified responses required a lower level of education (Figure 2).

Conclusion: With the advent of technology and the Internet, patients are increasingly seeking information online. This study shows ChatGPT provides an avenue for patients to access comprehensive and accurate disease-related information tailored to their educational level.
