Choosing a Model When Hallucinations Can Cause Harm: A FACTS Benchmark Case Study
How a Healthcare SaaS Team Used FACTS to Decide Between GPT-4 and Open-Weight Models

In April–May 2024, a 65-person healthcare SaaS company confronted a production decision: which language model to deploy inside a clinical decision-support