AI instructor gets high marks in guiding bystanders through CPR

14 hours ago
Jairia Dela Cruz, Senior Medical Writer; MIMS

An artificial intelligence (AI)-enabled cardiopulmonary resuscitation (CPR) instructor has shown promise for enhancing bystander emergency response in a proof-of-concept study, outperforming human dispatchers in delivering guideline-concordant CPR instruction.

The open-source, AI-powered, text-based CPR instructor ChatCPR was grounded in 911 dispatcher training materials and CPR best practices. This tool could serve as a supplemental approach to bystander CPR support, according to the investigators.

“Approximately 350,000 out-of-hospital cardiac arrests (OHCAs) occur annually in the US, with a survival rate of approximately 9 percent. Bystander-initiated CPR increases survival, but only 41.7 percent of individuals experiencing OHCA receive it, likely due to inadequate training and support to intervene,” they said.

“By providing accurate instructional coaching on demand, AI-enabled solutions may help address gaps in bystander readiness during OHCA,” they added.

Current AI models not good enough

ChatCPR was developed using findings from benchmark testing of six widely available AI models (ChatGPT, Claude, Gemini, Grok, Llama, and Mistral), which assessed the quality of their CPR coaching in simulated emergency scenarios.

“Our goal was to take a first step to understand how these tools perform and how they should be evaluated before being used in patient-facing settings,” said senior study author Dr Christopher Horvat from the University of Pittsburgh, Pittsburgh, Pennsylvania, US, in a press statement.

The benchmark test comprised four scenarios of confirmed cardiac arrest across a range of ages (from toddlers to seniors) and situations (eg, drowning, collapsing while jogging) where CPR instruction would be delivered differently. A fifth scenario represented a non-cardiac arrest emergency and was used to assess whether the models appropriately withheld CPR instructions.

Performance on CPR coaching was evaluated against a checklist of 27 yes-or-no items grouped under two sets of criteria, all derived from major association CPR guidelines. The minimally viable criteria covered core actions required for effective bystander CPR, such as instructing chest compressions of appropriate depth. The maximally effective criteria represented satisfaction of all checklist items to optimize CPR quality and safety, including more nuanced instructions such as ensuring that compressions achieved full recoil.
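To illustrate how a two-tier yes-or-no checklist of this kind could be scored, here is a minimal Python sketch. It is not the investigators' scoring code. The item names, the five-item checklist, and the `score` function are hypothetical stand-ins (the study's checklist had 27 items, and the article does not say how many counted as minimally viable). The sketch assumes each tier is reported as the share of its items that were satisfied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChecklistItem:
    name: str
    minimally_viable: bool  # True if the item is a core action


# Hypothetical items for illustration only; the real checklist had 27 items.
CHECKLIST = [
    ChecklistItem("assess_responsiveness", True),
    ChecklistItem("start_chest_compressions", True),
    ChecklistItem("compression_depth", True),
    ChecklistItem("full_chest_recoil", False),
    ChecklistItem("retrieve_and_use_aed", False),
]


def score(responses: dict[str, bool]) -> tuple[float, float]:
    """Return (minimally viable %, maximally effective %).

    Assumption: the maximally effective tier counts every checklist item,
    while the minimally viable tier counts only the core subset.
    """
    core = [item for item in CHECKLIST if item.minimally_viable]
    core_met = sum(responses.get(item.name, False) for item in core)
    all_met = sum(responses.get(item.name, False) for item in CHECKLIST)
    return 100 * core_met / len(core), 100 * all_met / len(CHECKLIST)


# One hypothetical transcript: all core actions covered, recoil omitted.
example = {
    "assess_responsiveness": True,
    "start_chest_compressions": True,
    "compression_depth": True,
    "full_chest_recoil": False,
    "retrieve_and_use_aed": True,
}
minimally_viable_pct, maximally_effective_pct = score(example)
print(minimally_viable_pct, maximally_effective_pct)  # 100.0 80.0
```

In this made-up transcript, every core action is covered but full recoil is never mentioned, so the minimally viable score is perfect while the maximally effective score is not. The benchmark results show the same kind of gap between the two tiers.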

Overall, AI models performed well on the basics of CPR, achieving 89.7 percent of minimally viable criteria. Scores ranged from 79.4 percent for Gemini to 97.1 percent for Grok and Claude. However, performance dropped for more advanced instructions, with the models achieving 69.8 percent of maximally effective criteria overall. Scores ranged from 61.3 percent for Llama to 75 percent for GPT-4o. [JAMA Intern Med 2026;doi:10.1001/jamainternmed.2026.1552]

ChatCPR bests human dispatchers

For the same scenarios used in the benchmark testing, ChatCPR achieved 100 percent adherence to both minimally viable and maximally effective criteria. The tool outperformed the best-performing baseline AI model by a factor of 1.4 for maximally effective checklist criteria, providing instructions to support automated external defibrillator (AED) use, positioning and appropriate recoil, and consistent initial patient assessment. 

Furthermore, in a head-to-head comparison with human dispatchers, ChatCPR demonstrated superior performance.

ChatCPR achieved 100 percent of minimally viable criteria and 98.9 percent of maximally effective criteria, whereas dispatchers achieved 84.5 percent and 62.8 percent, respectively. This translated to a relative performance difference of 1.2 for minimally viable criteria and 1.6 for maximally effective criteria.
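The relative performance figures can be reproduced from the reported percentages, assuming they are simple ratios of ChatCPR's score to the dispatchers' score. That is an assumption, since the article does not describe the calculation. The function name below is illustrative.

```python
def relative_performance(tool_pct: float, comparator_pct: float) -> float:
    """Ratio of two adherence percentages (assumed definition)."""
    return tool_pct / comparator_pct


# Percentages as reported in the article.
minimally_viable_ratio = relative_performance(100.0, 84.5)
maximally_effective_ratio = relative_performance(98.9, 62.8)

print(round(minimally_viable_ratio, 1))     # 1.2
print(round(maximally_effective_ratio, 1))  # 1.6
```

Under this assumption, both ratios round to the values quoted in the article.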

“Performance differences were most pronounced for minimally viable criteria related to assessing whether the patient was awake or responding, providing initial chest compression instructions, and instructing compression quality (depth and rate). For maximally effective criteria, the largest gaps were in directing the caller to retrieve and use an AED if available, instructing full chest recoil between compressions, and ensuring proper continual CPR positioning,” Horvat and colleagues noted in their paper.

Emergency care workflow support

“Our findings support further prospective evaluation of AI-enabled CPR instructions as a scalable intervention to improve resuscitation care and assess its potential effects on outcomes after OHCA in clinical trials. Such tools can be deployed through widely available platforms, including smartphones, search engines, and voice assistants, enabling immediate access to standardized instructions,” according to Horvat and colleagues.

“Unlike human dispatchers, AI-based system fidelity is not subject to fatigue, stress, cognitive overload, overemphasis on certain codewords or phrases or other human factors that can introduce variability in time-critical emergency communication. In addition, multilingual capabilities and independence from human staffing constraints could extend reliable CPR instructions to settings where dispatcher systems are limited or unavailable,” they added.

Horvat and colleagues emphasized that ChatCPR is not intended to replace human responders. The tool was designed to function as an adjunct across multiple points in the emergency care workflow.

“ChatCPR may support dispatchers by reinforcing standardized, guideline-based instructions and assisting with complex conditional guidance, such as paediatric-specific modifications, analogous to established clinical decision support systems. AI-enabled CPR instructions may similarly support health care professionals, emergency medical services, and other first responders by promoting resuscitation practices that are easily adaptable,” they pointed out.

Next steps in research

In an accompanying editorial, Drs Teva Brender, Sharon Inouye, and Cary Gross, editorial fellow, editor in chief, and associate editor at JAMA Internal Medicine, respectively, noted that the proof-of-concept study pushes “the field forward by demonstrating that an AI tool can deliver high-quality, guideline-concordant CPR instructions in the targeted situation of transcribed 911 calls.” [JAMA Intern Med 2026;doi:10.1001/jamainternmed.2026.1559]

However, Brender, Inouye, and Gross pointed out that an AI CPR instructor should have speech-to-speech functionality and be able to handle multiple languages, ambient noise, and situations with limited connectivity for it to be practical for clinical use.

In actual clinical care, “use of an AI CPR instructor … should focus on strategies that enhance and augment the role of the emergency dispatcher, the essential ‘human in the loop,’ by helping with performance of routine, structured tasks. This approach would enable dispatchers to provide the highest-quality recommendations for protocolized care, such as CPR, while retaining control over complex decision-making that incorporates contextual understanding, judgment under uncertainty, ethical reasoning, creativity, and flexibility,” they said.