AI outperforms doctors in Harvard trial of emergency triage diagnoses


From George Clooney in ER to Noah Wyle in The Pitt, emergency department doctors have long been popular heroes. But will it soon be time to hang up the scrubs? A groundbreaking Harvard study has found that AI systems outperformed human doctors in high-pressure emergency medicine triage, diagnosing more accurately in the potentially life-and-death moments when people are first rushed to hospital. The results were described by independent experts as showing “a genuine step forward” in the clinical reasoning of AIs and came as part of trials that tested the responses of hundreds of doctors against an AI. The authors said the results, published in the journal Science, showed large language models (LLMs) “have eclipsed most benchmarks of clinical reasoning”. One experiment focused on 76 patients who arrived at the emergency room of a Boston hospital.

An AI and a pair of human doctors were each given the same standard electronic health record to read – typically including vital sign data, demographic information and a few sentences from a nurse about why the patient was there. The AI identified the exact or very close diagnosis in 67% of cases, beating the human doctors, who were right only 50%-55% of the time. The experiment showed the AI’s advantage was particularly pronounced in triage circumstances requiring rapid decisions with minimal information. The diagnostic accuracy of the AI – OpenAI’s o1 reasoning model – rose to 82% when more detail was available, compared with the 70%-79% accuracy achieved by the expert humans, though this difference was not statistically significant. The AI also outperformed a larger cohort of human doctors when asked to provide longer-term treatment plans, such as providing antibiotics regimes or planning end-of-life processes.

The AI and 46 doctors were asked to examine five clinical case studies and the computer made significantly better plans, scoring 89% compared with 34% for humans using conventional resources, such as search engines. But it is not curtains for emergency doctors yet, the researchers said. The study only tested humans against AIs looking at patient data that can be communicated via text. The AI’s reading of signals, such as the patient’s level of distress and their visual appearance, was not tested. That means the AI was performing more like a clinician producing a second opinion based on paperwork.

“I don’t think our findings mean that AI replaces doctors,” said Arjun Manrai, one of the lead authors of the study who heads an AI lab at Harvard Medical School. “I think it does mean that we’re witnessing a really profound change in technology that will reshape medicine.” Dr Adam Rodman, another lead author and a doctor at Boston’s Beth Israel Deaconess medical centre where the study took place, said AI LLMs were among “the most impactful technologies in decades”. Over the next decade, he said, AI would not replace physicians but join them in a new “triadic care model … the doctor, the patient, and an artificial intelligence system”. In one case in the Harvard study, a patient presented with a blood clot to the lungs and worsening symptoms.

Human doctors thought the anti-coagulants were failing, but the AI noticed something the humans did not: the patient’s history of lupus meant this might be causing the inflammation of the lungs. The AI was proved correct. Nearly one in five US physicians are already using AI to assist diagnosis, according to research published last month. In the UK, 16% of doctors are using the tech daily and a further 15% weekly, with “clinical decision-making” being one of the most common uses, according to a recent Royal College of Physicians survey. The UK doctors’ biggest concerns were AI error and liability risks.

Billions are being invested in AI healthcare companies, but questions remain about the consequences of AI error. “There is not a formal framework right now for accountability,” said Rodman, who also stressed patients ultimately “want humans to guide them through life or death decisions [and] to guide them through challenging treatment decisions”. Prof Ewen Harrison, co-director of the University of Edinburgh’s centre for medical informatics, said the study was important and showed that “these systems are no longer just passing medical exams or solving artificial test cases. They are starting to look like useful second-opinion tools for clinicians, particularly when it is important to consider a wider range of possible diagnoses and avoid missing something important.” Dr Wei Xing, an assistant professor at the University of Sheffield’s school of mathematical and physical sciences, said some of the other findings suggested doctors may unconsciously defer to the AI’s answer rather than thinking independently.

“This tendency could grow more significant as AI becomes more routinely used in clinical settings,” he said. He also highlighted the lack of information about which patients the AI was worse at diagnosing and whether it struggled more with elderly patients or non-English speakers. He said: “It does not demonstrate that AI is safe for routine clinical use, nor that the public should turn to freely available AI tools as a substitute for medical advice.”