How DeepMind's Raia Hadsell Built a World Model Powerhouse: From 30 People to 1,200 Scientists
Raia Hadsell has spent nearly 13 years at DeepMind transforming a small research group into a sprawling organization of 1,200 scientists and engineers across 10 labs, all while developing practical AI systems that solve real-world problems. Her work spans weather prediction models that outperform traditional physics-based approaches, multimodal AI systems that process text, video, and audio simultaneously, and interactive 3D environments designed to train both human and robotic intelligence. Unlike many AI breakthroughs that remain theoretical, Hadsell's projects have moved from research papers into tangible applications that demonstrate how scaling teams and technology together can drive innovation forward.
What Does It Take to Scale an AI Research Organization?
When Hadsell started at DeepMind, the organization was a lean operation of roughly 30 to 40 people. Today, she co-leads a team that has grown more than 30-fold in size. This expansion didn't happen by accident; it required deliberate focus on maintaining core values while embracing growth. Hadsell emphasized that scaling isn't simply about hiring more people, but about preserving the collaborative spirit that made the original team effective.
"Growing a team is not just about numbers, but also about quality," said Raia Hadsell, co-lead of a 1,200-person research group at DeepMind.
The challenge of managing such rapid growth extends beyond team dynamics. As organizations scale, technical debt accumulates, communication becomes more complex, and maintaining alignment on research priorities becomes harder. Hadsell's approach has been to foster collaboration across the 10 labs while ensuring that every team member feels valued and heard. This human-centered approach to scaling offers a counterpoint to the common narrative that bigger teams automatically produce better results.
How Are Multimodal Models Changing What AI Can Do?
One of Hadsell's most significant contributions is advancing multimodal AI systems, exemplified by Gemini Embeddings 2. This model represents a fundamental shift in how AI processes information. Rather than handling text, images, video, and audio as separate inputs, multimodal systems integrate all of these data types simultaneously, creating a more holistic understanding of information.
Gemini Embeddings 2 operates with specific technical constraints that users need to understand before deployment. The system can process up to 8,800 tokens of text, roughly equivalent to 6,000 words. It can handle video inputs up to 128 seconds long and audio inputs up to 80 seconds. These limits matter because exceeding them requires breaking content into smaller chunks, which can degrade model performance and increase computational costs.
- Text Capacity: Processes up to 8,800 tokens, roughly equivalent to 6,000 words of text input
- Video Processing: Handles video inputs up to 128 seconds in length, enabling analysis of short-form video content
- Audio Integration: Processes audio inputs up to 80 seconds, allowing the model to understand spoken language and sound
The practical implication is that multimodal models like Gemini Embeddings 2 offer incredible flexibility for projects that need to combine different types of data, but they require careful planning. Users who don't understand these constraints upfront often encounter performance issues or unexpected costs when their projects exceed the model's limits.
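As a concrete illustration, a small pre-flight check against these limits can catch oversized inputs before they ever reach the model. The limit constants below simply restate the figures quoted above, and the `chunk_tokens` and `validate_media` helpers are a hypothetical sketch, not part of any Gemini API.

```python
# Limits restated from the figures above; in practice, read the real
# values from the model provider's documentation.
MAX_TEXT_TOKENS = 8_800
MAX_VIDEO_SECONDS = 128
MAX_AUDIO_SECONDS = 80

def chunk_tokens(tokens, limit=MAX_TEXT_TOKENS):
    """Split a token sequence into pieces that each fit within the limit."""
    return [tokens[i:i + limit] for i in range(0, len(tokens), limit)]

def validate_media(video_seconds, audio_seconds):
    """Return a list of constraint violations; empty means the input fits."""
    problems = []
    if video_seconds > MAX_VIDEO_SECONDS:
        problems.append(f"video is {video_seconds}s, limit is {MAX_VIDEO_SECONDS}s")
    if audio_seconds > MAX_AUDIO_SECONDS:
        problems.append(f"audio is {audio_seconds}s, limit is {MAX_AUDIO_SECONDS}s")
    return problems
```

Chunking keeps each request inside the token limit, but, as noted above, splitting content can itself degrade quality, so a check like this is a planning tool rather than a free pass.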
Why Are AI Weather Forecasting Models Outperforming Traditional Physics?
Perhaps the most striking achievement in Hadsell's portfolio is the development of GraphCast and GenCast, AI models designed for weather prediction. These systems use graph neural networks, a type of AI architecture that excels at understanding relationships between connected data points, to forecast weather patterns. What makes this breakthrough significant is the performance gap: GenCast outperforms traditional physics-based weather models in 97% of evaluations.
This result challenges a long-held assumption in meteorology that physics-based models, which encode centuries of understanding about atmospheric dynamics, would always outperform machine learning approaches. The success of GenCast suggests that AI can learn weather patterns from historical data in ways that complement or even exceed traditional methods. However, this comes with trade-offs. Training these models demands substantial compute, and they require careful validation to ensure they perform reliably in edge cases and extreme weather events.
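To make the graph-neural-network idea concrete, here is a toy message-passing step over a handful of connected weather-observation nodes, written with NumPy. It sketches the general technique of aggregating neighbours' features; it is not GraphCast's or GenCast's actual architecture, and the update rule and weights are purely illustrative.

```python
import numpy as np

def message_passing_step(node_features, edges, w_self, w_neigh):
    """One round of graph message passing: each node averages its
    neighbours' features, then mixes them with its own state."""
    n = node_features.shape[0]
    agg = np.zeros_like(node_features)
    counts = np.zeros(n)
    for src, dst in edges:          # directed edge: src sends to dst
        agg[dst] += node_features[src]
        counts[dst] += 1
    counts = np.maximum(counts, 1)  # avoid dividing by zero at isolated nodes
    agg /= counts[:, None]
    # Simple update rule: linear self-transform plus neighbour average,
    # squashed through a nonlinearity.
    return np.tanh(node_features @ w_self + agg @ w_neigh)
```

Stacking several such steps lets information propagate across the graph, which is how these architectures capture relationships between distant but connected data points.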
Hadsell also developed Functional Generative Networks (FGN) specifically for cyclone prediction. By combining generative models with functional data analysis, a statistical technique for analyzing continuous data, FGN improves both prediction accuracy and response time. This matters because cyclone forecasting requires speed; faster predictions give communities more time to prepare and evacuate.
How to Implement Advanced AI Systems Without Cost Overruns
- Assess Resource Requirements Upfront: Before deploying 3D simulations or complex multimodal models, thoroughly evaluate your computational needs and budget constraints to avoid unexpected cost escalation
- Understand Technical Limits: Know the specific constraints of your AI model, such as token limits for text or duration limits for video, and plan your data pipeline accordingly
- Balance Precision and Efficiency: Recognize that higher model accuracy often comes with higher computational costs; determine the minimum acceptable performance level for your use case
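The precision-versus-efficiency trade-off in the last point can be made mechanical: enumerate candidate configurations, filter by the minimum acceptable accuracy, and take the cheapest survivor. The tier names, accuracy figures, and costs below are invented for illustration.

```python
def pick_config(configs, min_accuracy):
    """Choose the cheapest (name, accuracy, cost) entry that still meets
    the minimum acceptable accuracy for the use case."""
    viable = [c for c in configs if c[1] >= min_accuracy]
    if not viable:
        raise ValueError("no configuration meets the accuracy floor")
    return min(viable, key=lambda c: c[2])

# Hypothetical model tiers: (name, accuracy, cost per 1,000 requests in USD)
TIERS = [
    ("small", 0.88, 2.00),
    ("medium", 0.93, 10.00),
    ("large", 0.96, 45.00),
]
```

With a 0.90 accuracy floor, `pick_config(TIERS, 0.90)` selects the medium tier: in this made-up example the large model's extra three points of accuracy cost 4.5 times more and are only worth paying for when the floor demands them.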
What Role Do Interactive 3D Environments Play in AI Development?
The Genie project represents another frontier in Hadsell's work: creating interactive 3D environments for training AI agents. These simulations allow researchers to develop and test AI systems in controlled settings before deploying them in the real world. The environments are designed to be realistic enough to teach useful behaviors while remaining computationally manageable.
Interactive 3D simulations have applications for both human and robotic intelligence. For robotics, they enable researchers to train systems on tasks like manipulation, navigation, and object recognition without the cost and risk of physical hardware. For human-AI interaction, they create spaces where AI agents can learn to communicate and collaborate with people. However, these environments are resource-intensive. Running high-fidelity 3D simulations requires significant computing power, and costs can escalate quickly if not carefully managed.
"Direct cyclone prediction with FGN improves accuracy and operational efficiency," Hadsell noted.
The lesson from Hadsell's experience with 3D environments is that powerful tools require disciplined resource management. Researchers who dive into interactive simulations without first assessing their actual computational needs often find themselves facing unexpected bills or performance bottlenecks. The key is to start with clear objectives and scale the simulation complexity only as needed.
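That discipline can be as simple as estimating a run's cost before launching it and trimming the request to fit the budget. The linear cost model below is a deliberate simplification for illustration; real simulation costs also depend on fidelity, rendering, and hardware, and all the numbers are hypothetical.

```python
def simulation_cost(n_agents, steps, cost_per_agent_step):
    """Estimate total compute cost, assuming cost scales linearly with
    agent-steps (a simplification for planning purposes)."""
    return n_agents * steps * cost_per_agent_step

def plan_run(n_agents, steps, cost_per_agent_step, budget):
    """Return the largest step count (up to the request) that fits the budget."""
    if simulation_cost(n_agents, steps, cost_per_agent_step) <= budget:
        return steps
    return int(budget // (n_agents * cost_per_agent_step))
```

Running the estimate first turns "unexpected bills" into an explicit decision: either raise the budget, reduce the agent count, or accept a shorter run.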
What Does Hadsell's Journey Reveal About the Future of AI Research?
Hadsell's 13-year arc at DeepMind illustrates several broader trends in AI development. First, the most impactful AI research increasingly focuses on practical applications rather than theoretical breakthroughs alone. Weather forecasting, cyclone prediction, and robotic training all solve concrete problems that affect real people. Second, scaling research organizations requires as much attention to human factors as to technical innovation. Growing from 30 to 1,200 people while maintaining research quality is a management challenge as much as a technical one. Finally, the most advanced AI systems today are multimodal and integrated, combining different types of data and different technical approaches to achieve results that no single method could achieve alone.
For organizations looking to adopt these technologies, the takeaway is clear: advanced AI systems offer genuine value, but they require careful implementation. Understanding the constraints of multimodal models, the trade-offs between accuracy and computational cost, and the resource demands of interactive simulations is essential to avoiding costly mistakes. Hadsell's work demonstrates that the frontier of AI isn't just about building bigger models; it's about building systems that work reliably in the real world.