NEW DELHI: With chatbots increasingly relied upon to make sense of symptoms or test results, a study has shown that AI tools that perform well on medical exam-style tests may fare worse in conversations closer to real-world interactions.

The study, published in the journal Nature Medicine, also proposes recommendations for evaluating large language models (LLMs), which power chatbots such as ChatGPT, before they are used in clinical settings. LLMs are trained on massive text datasets and can therefore respond to a user’s requests in natural language.

Researchers at Harvard Medical School and Stanford University, US, designed a framework called ‘CRAFT-MD’ to evaluate how well four LLMs, including GPT-4 and Mistral, performed in settings that closely mimic actual interactions with patients.

Source: PTI

 
