Radiologists trained to spot the subtlest abnormalities in medical imaging are failing to distinguish real X-rays from AI-generated forgeries at rates that should alarm hospitals and patients alike. A new study from Mount Sinai's Icahn School of Medicine tested 17 practicing radiologists from six countries and found they could identify AI-generated X-rays with only 41% accuracy when unaware the test contained fakes. Even when the radiologists were told to expect synthetic images, their accuracy improved to just 75%.

Why Can't Experts Spot AI-Generated Medical Images?

The research team, led by Dr. Mickael Tordjman, tested radiologists against two distinct datasets. The first contained 154 static X-rays, half real and half generated by OpenAI's ChatGPT-4o. The second used a specialized AI model called RoentGen, trained specifically to create believable chest X-rays with visible organs like the heart and lungs. Individual radiologist performance ranged from 58% to 92% accuracy on ChatGPT-generated images and 62% to 78% on RoentGen images, with no clear correlation between experience level and detection ability.

The reason these fakes fool experts reveals something unsettling about how AI generates medical images. Tordjman noted that deepfake X-rays often look suspiciously perfect: bones are overly smooth, spines unnaturally straight, lungs overly symmetrical, blood vessel patterns excessively uniform, and fractures unusually clean and consistent. In other words, the AI removes the messy reality of actual human anatomy, creating images that are technically flawless but subtly wrong in ways that bypass human pattern recognition.

Even large language models struggled with the task. ChatGPT-4o, the very model used to generate some of the fake images, achieved only 57% accuracy at identifying its own creations. Google's Gemini 2.5 Pro, Meta's Llama 4 Maverick, and ChatGPT-5 ranged from 57% to 89% accuracy across the datasets.

What Are the Real-World Risks of Deepfake Medical Evidence?

The implications extend far beyond academic concern. Tordjman identified three distinct threat vectors. First, fraudulent litigation becomes viable if a fabricated fracture or other injury is indistinguishable from a real one. Legal experts are already grappling with how to protect juries from exposure to AI forgeries that could taint cases. Second, there is significant cybersecurity risk if hackers gain access to a hospital's network and inject synthetic images to manipulate patient diagnoses or cause widespread clinical chaos. Third, insurance fraud and workers' compensation scams become easier to execute when medical evidence can be fabricated convincingly.

The vulnerability is particularly acute because radiologists are the most highly trained medical image specialists on the planet. If they cannot reliably distinguish real from fake, the barrier to entry for medical fraud has collapsed. A bad actor no longer needs access to actual patient records or imaging equipment; they need only a generative AI model and basic knowledge of how to insert images into medical systems.
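Before turning to defenses, it is worth seeing how the "suspiciously perfect" artifacts described above might be quantified in software. The sketch below is a toy illustration only: the two metrics (left-right symmetry and texture smoothness) and the file name are assumptions made for this example, not the study's methodology or any validated detection tool.

```python
# Toy screening heuristic inspired by the artifacts Tordjman describes:
# unnaturally mirror-symmetric anatomy and overly smooth texture.
# Purely illustrative; not from the Mount Sinai study.
import numpy as np
from PIL import Image

def suspicion_scores(path: str) -> dict:
    """Return two crude indicators for a grayscale X-ray image."""
    img = np.asarray(Image.open(path).convert("L"), dtype=np.float32) / 255.0

    # 1. Left-right symmetry: real anatomy is rarely a near-perfect mirror image.
    flipped = np.fliplr(img)
    symmetry = 1.0 - float(np.mean(np.abs(img - flipped)))  # 1.0 = perfect mirror

    # 2. Texture smoothness: generative models often suppress fine-grained noise,
    #    which shows up as unusually low variance in local gradients.
    gy, gx = np.gradient(img)
    smoothness = 1.0 / (1.0 + float(np.var(np.sqrt(gx**2 + gy**2))))

    return {"symmetry": symmetry, "smoothness": smoothness}

if __name__ == "__main__":
    # "chest_xray.png" is a hypothetical file name used for illustration.
    print(suspicion_scores("chest_xray.png"))
```

High values on both axes would at most justify a closer human look; nothing this simple approaches the specialized detection tools the researchers are working to build.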
How to Strengthen Medical Imaging Defenses Against AI Forgeries

- Implement Detection Tools: Tordjman's team is working to establish educational datasets and specialized detection tools designed to identify the telltale signs of AI-generated medical images, such as overly smooth bone surfaces and unnaturally symmetrical organs.
- Develop Training Datasets: Medical institutions should create internal training programs using realistic deepfake examples so radiologists can build pattern recognition skills specific to AI-generated artifacts before encountering them in clinical practice.
- Strengthen Network Security: Hospitals must implement robust access controls, audit logs, and image verification protocols to detect unauthorized injection of synthetic images into patient records or PACS (Picture Archiving and Communication Systems); a minimal integrity-check sketch appears at the end of this article.
- Establish Verification Protocols: Cross-reference imaging results with patient history, clinical symptoms, and multiple imaging modalities to catch inconsistencies that might indicate synthetic evidence.

The study, published in the journal Radiology on Tuesday, offers a sobering reminder that expertise alone is an insufficient defense against generative AI. One interesting finding emerged: musculoskeletal radiologists, who specialize in bones and joints, proved significantly better at spotting fakes than other subspecialists. This suggests that domain-specific training and familiarity with the subtle variations in normal anatomy may offer some protection, but the overall accuracy rates remain too low to rely on human detection alone.

"Our study demonstrates that these deepfake X-rays are realistic enough to deceive radiologists, the most highly trained medical image specialists, even when they were aware that AI-generated images were present," stated Dr. Mickael Tordjman, lead author and post-doctoral fellow at the Icahn School of Medicine.

The Mount Sinai research arrives at a critical moment. As generative AI tools become more accessible and capable, the attack surface for medical fraud expands. Hospitals, insurers, legal systems, and patients all face new risks that existing safeguards were never designed to address. The path forward requires a combination of technical innovation, institutional vigilance, and a fundamental shift in how medical institutions approach image verification and network security.
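That closing call for stronger image verification can be made concrete with a small, heavily simplified sketch of tamper evidence: record a keyed hash of each image at acquisition time and re-verify it before the image is read. The key handling, file name, and workflow below are illustrative assumptions; a real deployment would rest on PACS-level access controls, audit logs, and standards-based mechanisms such as DICOM digital signatures rather than this toy code.

```python
# Minimal sketch of image-integrity checking, using only the standard library.
# Illustrative assumptions throughout: the key, the file name, and where the
# recorded tag would be stored are placeholders, not a production design.
import hmac
import hashlib
from pathlib import Path

SECRET_KEY = b"replace-with-a-managed-per-device-key"  # hypothetical key

def sign_image(path: Path) -> str:
    """Compute an HMAC-SHA256 tag over the raw image bytes at acquisition time."""
    return hmac.new(SECRET_KEY, path.read_bytes(), hashlib.sha256).hexdigest()

def verify_image(path: Path, recorded_tag: str) -> bool:
    """Return True only if the file still matches the tag recorded earlier."""
    return hmac.compare_digest(sign_image(path), recorded_tag)

if __name__ == "__main__":
    image = Path("study_001.dcm")    # hypothetical image file
    tag = sign_image(image)          # stored alongside the audit log at acquisition
    print(verify_image(image, tag))  # False if the file is later swapped or altered
```

A check like this cannot say whether an image was synthetic to begin with; it only makes later substitution or injection detectable, which is why it complements rather than replaces the detection and training measures above.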