OpenAI's Safety Fellowship Masks a Deeper Problem: Internal Teams Dismantled While External Research Gets Funded
OpenAI announced a new Safety Fellowship offering external researchers $3,850 weekly stipends plus compute resources to study artificial intelligence risks, but the timing raised eyebrows when a major investigation revealed the company had dismantled three consecutive internal safety teams over the previous 22 months. The contrast between funding independent researchers and dissolving internal safety infrastructure highlights a fundamental tension in how the company approaches existential AI risk.
What Happened to OpenAI's Internal Safety Teams?
According to a New Yorker investigation published on April 6, 2026, the same day as the Safety Fellowship announcement, OpenAI eliminated three dedicated safety teams. The superalignment team was shut down in May 2024 following the departure of co-leads Ilya Sutskever and Jan Leike. Leike publicly stated upon leaving that safety culture and processes had taken a backseat to shiny products. The AGI Readiness team was dissolved in October 2024, and the Mission Alignment team was disbanded in February 2026 after just 16 months of existence.
The investigation also documented a striking moment when a journalist asked OpenAI representatives to speak with the company's existential safety researchers. The response: "What do you mean by existential safety? That is not, like, a thing." This statement from a company that helped popularize debate around existential AI risks signals a significant shift in organizational priorities.
How Does the Safety Fellowship Actually Work?
The Safety Fellowship runs from September 14, 2026 to February 5, 2027, with applications closing May 3 and notifications expected by July 25. The program structure includes several key components designed to support independent research:
- Compensation Package: Fellows receive $3,850 weekly, totaling over $200,000 annualized, plus approximately $15,000 per month in compute resources and direct mentorship from OpenAI researchers
- Research Autonomy: Participants can define their own research direction within seven priority areas, including safety evaluation, ethics, robustness, scalable mitigations, privacy-preserving safety methods, autonomous agent oversight, and high-severity misuse domains
- Work Flexibility: Fellows can work in person at OpenAI's Constellation space in Berkeley or remotely, though they do not have access to OpenAI's internal systems, proprietary data, or model training processes
- Recruitment Focus: OpenAI actively recruits professionals from cybersecurity, social sciences, human-computer interaction, and computer science, prioritizing research ability and execution capability over specific academic credentials
The program explicitly positions itself as external research funding rather than a replacement for internal safety infrastructure. Fellows receive API credits and compute power but cannot directly audit model training, evaluate proprietary datasets, or influence development decisions in real time.
Why the Timing Matters for AI Risk Assessment
The simultaneous announcement of the Safety Fellowship and revelation of dissolved internal teams creates a credibility problem for OpenAI's safety narrative. Documents filed with government agencies like the Internal Revenue Service reflect deliberate organizational decisions, not bureaucratic oversights. OpenAI removed the word "safely" from its official mission statement filed with the IRS, a symbolic change that carries considerable weight.
External research funding, while valuable, cannot replicate the operational safety infrastructure that internal teams provide. Independent researchers cannot influence real-time development decisions, audit training processes, or access the proprietary systems where safety risks actually emerge. The Safety Fellowship functions more like an academic research funding program than an operational safety structure.
The broader context matters here. As AI systems become more capable and autonomous, the gap between external evaluation and internal safety oversight grows more consequential. Researchers studying AI risks from outside a company's walls cannot catch problems during development; they can only analyze systems after they exist.
What Are the Seven Research Priority Areas?
OpenAI defined a broad research agenda for Safety Fellows, covering most topics the scientific community considers urgent for AI safety. By February 2027, each fellow must deliver substantive output such as a scientific paper, benchmark, or dataset. The priority areas include:
- Safety Evaluation: Methods for testing and measuring how safe AI models are across different use scenarios and deployment contexts
- Ethics Research: Investigation into the moral and social implications of developing and deploying advanced AI systems
- Robustness Studies: Research on making models more resistant to failures, adversarial attacks, and unexpected behaviors in real-world conditions
- Scalable Mitigations: Development of safety solutions that can keep pace with model growth as systems become more capable
- Privacy-Preserving Safety Methods: Techniques for evaluating and improving system safety without compromising sensitive user data
- Autonomous Agent Oversight: Mechanisms for monitoring and controlling AI systems that operate independently, a particularly relevant frontier as AI agents become more prevalent
- High-Severity Misuse Domains: Research focused on scenarios where AI misuse can cause significant harm
The inclusion of autonomous agent oversight reflects genuine concern about next-generation AI systems that make decisions with minimal human intervention. However, external researchers cannot directly influence how these safety insights translate into actual system design.
How Can External Safety Research Complement Internal Oversight?
While the Safety Fellowship cannot replace internal safety teams, external research can serve important functions if properly structured. Consider the practical ways independent researchers can contribute to AI safety:
- Independent Verification: External researchers can conduct unbiased evaluations of AI systems without conflicts of interest, providing credibility that internal assessments may lack in the eyes of regulators and the public
- Methodological Innovation: Researchers working outside corporate constraints can develop novel testing approaches and safety benchmarks that might not emerge from internal teams focused on product deployment timelines
- Cross-Disciplinary Insights: By recruiting from cybersecurity, social sciences, and human-computer interaction, the fellowship can bring perspectives that pure AI specialists might miss when assessing real-world deployment risks
- Public Knowledge Building: External research produces published papers and datasets that advance the entire field's understanding of AI risks, rather than remaining proprietary to a single company
However, these benefits assume that external research findings actually influence internal development decisions, which the fellowship's structure does not guarantee.
The Broader AI Risk Landscape
The tension between OpenAI's external safety fellowship and its internal team dissolutions occurs within a larger context of AI risk acceleration. Separately, cybersecurity researchers have documented how advanced AI systems like Anthropic's Claude Mythos can autonomously discover software vulnerabilities, including some decades old, and chain them into devastating attack vectors. This capability, distributed through Project Glasswing to 40 organizations, illustrates how AI-driven threats are outpacing traditional defense mechanisms.
The cybersecurity implications underscore why internal safety oversight matters. When AI systems can discover and exploit vulnerabilities faster than human teams can patch them, the traditional attacker-defender timeline collapses. External research on these risks, while valuable, cannot substitute for internal teams embedded in the development process who can implement safeguards before capabilities are released.
OpenAI's Safety Fellowship represents a genuine commitment to independent research on AI risks, and the funding level is substantial. However, the program's structure and the timing of its announcement alongside reports of dissolved internal teams suggest a fundamental mismatch between stated safety priorities and organizational decisions. External researchers can study AI risks, but they cannot prevent them from emerging in the first place.