AI chatbots fail medical misinformation test, returning inaccurate and fabricated advice
(www.psypost.org)
"We did it, Patrick! We made a technological breakthrough!"
Couldn't the researchers at least bother to use the latest models?
The study was done in Feb 2025, and they probably wrote the research proposal months before then, waited for approval and funding, etc. I don't know how the academic process works, but I imagine it to be very slow and bureaucratic.
https://bmjopen.bmj.com/content/16/4/e112695
Well, they didn't even use the latest models available in Feb 2025. They should've used DeepSeek R1 and OpenAI o3-mini, which use additional test-time compute to arrive at better answers. They used GPT-3.5, which was about 2½ years old at the time.