Radiologists trained to spot the subtlest abnormalities in medical imaging are failing to distinguish real X-rays from AI-generated forgeries at rates that should alarm hospitals and patients alike. A new study from Mount Sinai's Icahn School of Medicine tested 17 practicing radiologists from six countries and found they could identify AI-generated X-rays with only 41% accuracy when unaware the test contained fakes. Even when the radiologists were told to expect synthetic images, their accuracy improved to just 75%.

Why Can't Experts Spot AI-Generated Medical Images?

The research team, led by Dr. Mickael Tordjman, tested radiologists against two distinct datasets. The first contained 154 static X-rays, half real and half generated by OpenAI's ChatGPT-4o. The second used a specialized AI model called RoentGen, trained specifically to create believable chest X-rays with visible organs like the heart and lungs. Individual radiologist performance ranged from 58% to 92% accuracy on ChatGPT-generated images and 62% to 78% on RoentGen images, with no clear correlation between experience level and detection ability.

The reason these fakes fool experts reveals something unsettling about how AI generates medical images. Tordjman noted that deepfake X-rays often look suspiciously perfect: bones are overly smooth, spines unnaturally straight, lungs overly symmetrical, blood vessel patterns excessively uniform, and fractures unusually clean and consistent. In other words, the AI removes the messy reality of actual human anatomy, creating images that are technically flawless but subtly wrong in ways that bypass human pattern recognition.

Even large language models struggled with the task. ChatGPT-4o, the very model used to generate some of the fake images, achieved only 57% accuracy at identifying its own creations. Google's Gemini 2.5 Pro, Meta's Llama 4 Maverick, and ChatGPT-5 ranged from 57% to 89% accuracy across the datasets.

What Are the Real-World Risks of Deepfake Medical Evidence?

The implications extend far beyond academic concern. Tordjman identified three distinct threat vectors. First, fraudulent litigation becomes viable if a fabricated fracture or other injury is indistinguishable from a real one. Legal experts are already grappling with how to protect juries from exposure to AI forgeries that could taint cases. Second, there is significant cybersecurity risk if hackers gain access to a hospital's network and inject synthetic images to manipulate patient diagnoses or cause widespread clinical chaos. Third, insurance fraud and workers' compensation scams become easier to execute when medical evidence can be fabricated convincingly.

The vulnerability is particularly acute because radiologists are the most highly trained medical image specialists on the planet. If they cannot reliably distinguish real from fake, the barrier to entry for medical fraud has collapsed. A bad actor no longer needs access to actual patient records or imaging equipment; they need only a generative AI model and basic knowledge of how to insert images into medical systems.
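Before turning to defenses, it is worth seeing how the "suspiciously perfect" artifacts described above might be quantified in software. The sketch below is a toy illustration only: the two metrics (left-right symmetry and texture smoothness) and the file name are assumptions made for this example, not the study's methodology or any validated detection tool.

```python
# Toy screening heuristic inspired by the artifacts Tordjman describes:
# unnaturally mirror-symmetric anatomy and overly smooth texture.
# Purely illustrative; not from the Mount Sinai study.
import numpy as np
from PIL import Image

def suspicion_scores(path: str) -> dict:
    """Return two crude indicators for a grayscale X-ray image."""
    img = np.asarray(Image.open(path).convert("L"), dtype=np.float32) / 255.0

    # 1. Left-right symmetry: real anatomy is rarely a near-perfect mirror image.
    flipped = np.fliplr(img)
    symmetry = 1.0 - float(np.mean(np.abs(img - flipped)))  # 1.0 = perfect mirror

    # 2. Texture smoothness: generative models often suppress fine-grained noise,
    #    which shows up as unusually low variance in local gradients.
    gy, gx = np.gradient(img)
    smoothness = 1.0 / (1.0 + float(np.var(np.sqrt(gx**2 + gy**2))))

    return {"symmetry": symmetry, "smoothness": smoothness}

if __name__ == "__main__":
    # "chest_xray.png" is a hypothetical file name used for illustration.
    print(suspicion_scores("chest_xray.png"))
```

High values on both axes would at most justify a closer human look; nothing this simple approaches the specialized detection tools the researchers are working to build.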
How to Strengthen Medical Imaging Defenses Against AI Forgeries

- Implement Detection Tools: Tordjman's team is working to establish educational datasets and specialized detection tools designed to identify the telltale signs of AI-generated medical images, such as overly smooth bone surfaces and unnaturally symmetrical organs.
- Develop Training Datasets: Medical institutions should create internal training programs using realistic deepfake examples so radiologists can build pattern recognition skills specific to AI-generated artifacts before encountering them in clinical practice.
- Strengthen Network Security: Hospitals must implement robust access controls, audit logs, and image verification protocols to detect unauthorized injection of synthetic images into patient records or PACS (Picture Archiving and Communication Systems); a minimal integrity-check sketch appears at the end of this article.
- Establish Verification Protocols: Cross-reference imaging results with patient history, clinical symptoms, and multiple imaging modalities to catch inconsistencies that might indicate synthetic evidence.

The study, published in the journal Radiology on Tuesday, offers a sobering reminder that expertise alone is an insufficient defense against generative AI. One interesting finding emerged: musculoskeletal radiologists, who specialize in bones and joints, proved significantly better at spotting fakes than other subspecialists. This suggests that domain-specific training and familiarity with the subtle variations in normal anatomy may offer some protection, but the overall accuracy rates remain too low to rely on human detection alone.

"Our study demonstrates that these deepfake X-rays are realistic enough to deceive radiologists, the most highly trained medical image specialists, even when they were aware that AI-generated images were present," stated Dr. Mickael Tordjman, lead author and post-doctoral fellow at the Icahn School of Medicine.

The Mount Sinai research arrives at a critical moment. As generative AI tools become more accessible and capable, the attack surface for medical fraud expands. Hospitals, insurers, legal systems, and patients all face new risks that existing safeguards were never designed to address. The path forward requires a combination of technical innovation, institutional vigilance, and a fundamental shift in how medical institutions approach image verification and network security.
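That closing call for stronger image verification can be made concrete with a small, heavily simplified sketch of tamper evidence: record a keyed hash of each image at acquisition time and re-verify it before the image is read. The key handling, file name, and workflow below are illustrative assumptions; a real deployment would rest on PACS-level access controls, audit logs, and standards-based mechanisms such as DICOM digital signatures rather than this toy code.

```python
# Minimal sketch of image-integrity checking, using only the standard library.
# Illustrative assumptions throughout: the key, the file name, and where the
# recorded tag would be stored are placeholders, not a production design.
import hmac
import hashlib
from pathlib import Path

SECRET_KEY = b"replace-with-a-managed-per-device-key"  # hypothetical key

def sign_image(path: Path) -> str:
    """Compute an HMAC-SHA256 tag over the raw image bytes at acquisition time."""
    return hmac.new(SECRET_KEY, path.read_bytes(), hashlib.sha256).hexdigest()

def verify_image(path: Path, recorded_tag: str) -> bool:
    """Return True only if the file still matches the tag recorded earlier."""
    return hmac.compare_digest(sign_image(path), recorded_tag)

if __name__ == "__main__":
    image = Path("study_001.dcm")    # hypothetical image file
    tag = sign_image(image)          # stored alongside the audit log at acquisition
    print(verify_image(image, tag))  # False if the file is later swapped or altered
```

A check like this cannot say whether an image was synthetic to begin with; it only makes later substitution or injection detectable, which is why it complements rather than replaces the detection and training measures above.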