The New Race to Make AI Understand Every Language: Why Under-Resourced Languages Are Getting Their Moment

The vast majority of AI language models are trained primarily on English and a handful of wealthy-nation languages, leaving billions of people with AI tools that don't understand their native speech, text, or cultural context. Now, a new initiative is directly tackling this gap. Dr. Atnafu Lambebo Tonja has been appointed as the second Google DeepMind Academic Fellow at University College London (UCL), starting in March 2026, with a specific mandate: develop AI systems and evaluation methods that work for under-resourced languages and the communities that speak them.

This appointment signals a meaningful shift in how major AI labs are approaching the global language problem. Rather than treating multilingual AI as a secondary feature, UCL and Google DeepMind are positioning it as foundational research worthy of dedicated fellowship support. Dr. Tonja's work will focus on building large language models tailored for low-resource languages, creating culturally relevant evaluation benchmarks, and exploring multimodal approaches that combine speech, text, and vision-language models.

What Makes Under-Resourced Languages So Difficult for AI?

Training AI models requires massive amounts of text data. English has billions of words available online, scraped from websites, books, and social media. But for many languages, especially those spoken primarily in developing regions, that data simply doesn't exist at scale. A language model trained on limited data tends to perform poorly, making errors more frequently and understanding context less reliably than models trained on abundant data.

Dr. Tonja's research directly addresses this challenge. His recent work includes developing Afri-MCQA, a multilingual African cultural question-answering benchmark, and InkubaLM, an African-focused small language model designed to work with limited training data. In 2024, his work on these projects earned him the EMNLP Outstanding Paper Award, one of the top honors in natural language processing research.

The problem extends beyond text. Many under-resourced languages lack standardized ways to evaluate whether an AI system actually understands them correctly. Dr. Tonja's research includes developing culturally relevant evaluation benchmarks, which means creating tests that measure whether an AI system understands not just the words, but the cultural context and nuances embedded in a language.

How Multimodal AI Could Bridge the Language Gap

One of the most promising approaches Dr. Tonja is exploring involves combining multiple types of input and output: speech, text, and images. This multimodal approach could be particularly valuable for languages with limited written text online. If an AI system can learn from audio recordings, video content, and images alongside text, it has more sources of information to draw from, potentially reducing the amount of pure text data needed.
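To make the idea concrete, here is a toy sketch of one simple way such systems can combine modalities, known as "late fusion": each modality (text, speech, image) is first encoded into its own embedding vector, and the vectors are then averaged into a single representation. The function and vectors below are illustrative assumptions for this article, not Dr. Tonja's actual method; real systems learn the fusion rather than averaging.

```python
def late_fuse(embeddings):
    """Combine per-modality embedding vectors (e.g. text, speech, image)
    by element-wise averaging -- a minimal 'late fusion' baseline.
    Modalities set to None are skipped, so a sample with audio but no
    written text still gets a usable combined representation."""
    vectors = [v for v in embeddings.values() if v is not None]
    if not vectors:
        raise ValueError("at least one modality is required")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

# A sample with speech and image features but no written text:
fused = late_fuse({"text": None, "speech": [1.0, 3.0], "image": [3.0, 1.0]})
# fused == [2.0, 2.0]
```

The key property for low-resource settings is graceful degradation: the representation is still defined when the scarcest modality (often written text) is absent.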

This strategy aligns with broader industry trends. Mistral AI, a French AI company, recently released Voxtral TTS, an open-source text-to-speech model that supports nine languages: English, French, German, Spanish, Dutch, Portuguese, Italian, Hindi, and Arabic. The company is explicitly building toward an end-to-end platform that handles multimodal input and output.

"We plan to have an end-to-end platform that can handle multimodal streams of input, including audio, text, and image and output as well. The main benefit of that is you get way more information with an end-to-end agentic system that supports audio as an input or output," said Pierre Stock, VP of Science Operations at Mistral AI.


For under-resourced languages, this multimodal capability could be transformative. A language with limited written text but rich oral traditions or video content could potentially be better served by AI systems that learn from all three modalities simultaneously.

Steps to Building More Inclusive AI Language Systems

  • Develop Low-Resource Language Models: Create large language models specifically designed and trained for languages with limited online data, using techniques like transfer learning from high-resource languages and data augmentation strategies.
  • Build Culturally Relevant Benchmarks: Design evaluation tests that measure whether AI systems understand not just vocabulary and grammar, but the cultural context, idioms, and nuances specific to each language community.
  • Integrate Multimodal Learning: Combine speech, text, and vision inputs so AI systems can learn from audio recordings, videos, and images alongside written text, reducing dependence on scarce written data.
  • Support Cross-Lingual Transfer: Use knowledge from high-resource languages to help train models for related low-resource languages, allowing AI systems to leverage linguistic similarities across language families.
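The data augmentation strategy mentioned in the steps above can be sketched in a few lines. The example below uses token-level perturbations (random swap and random deletion), in the style of the well-known "easy data augmentation" (EDA) technique for stretching scarce training text; the function name and parameters are illustrative, not taken from Dr. Tonja's work.

```python
import random

def augment(sentence, n_aug=4, p_delete=0.1, seed=0):
    """Generate augmented variants of a sentence via token-level
    perturbations -- one cheap way to expand a small training corpus
    for a low-resource language."""
    rng = random.Random(seed)  # seeded for reproducibility
    tokens = sentence.split()
    variants = []
    for _ in range(n_aug):
        toks = tokens[:]
        # Random swap: exchange two token positions.
        if len(toks) >= 2:
            i, j = rng.sample(range(len(toks)), 2)
            toks[i], toks[j] = toks[j], toks[i]
        # Random deletion: drop each token with probability p_delete,
        # but never delete the whole sentence.
        kept = [t for t in toks if rng.random() > p_delete]
        variants.append(" ".join(kept if kept else toks))
    return variants
```

Perturbations like these are noisy, which is why they are usually combined with the other steps above, such as transfer from a related high-resource language, rather than used alone.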

Dr. Tonja's appointment comes at a moment when the AI industry is increasingly recognizing that language diversity is not a niche concern. His previous work has been published in leading venues including ACL, NAACL, EMNLP, NeurIPS, and TACL, a sign that multilingual and low-resource language research is gaining prominence in mainstream AI research.

The partnership between UCL and Google DeepMind on this issue runs deep. Google DeepMind CEO Sir Demis Hassabis is a UCL alumnus, and the company has supported more than forty master's scholars and six PhD candidates, and established a chair in machine learning at the university. Dr. Tonja is the second Google DeepMind Academic Fellow appointed to UCL, following Dr. David Adelani, who is now an assistant professor at McGill University in Canada and was recently named a Schmidt Sciences AI 2050 Early Career Fellow.

"I'm delighted to welcome Atnafu to the UCL AI Centre. Atnafu brings deep technical knowledge and a passion to develop AI for good and to share AI skills widely," said Professor David Barber, Director of the UCL Centre for AI.


The broader context matters here. As AI systems become more integrated into education, healthcare, business, and government services globally, the languages these systems support directly determine who benefits and who gets left behind. A doctor in Nigeria, a student in Vietnam, or a small business owner in Peru cannot effectively use AI tools that don't understand their language. By investing in research specifically aimed at under-resourced languages, UCL and Google DeepMind are addressing a fundamental equity issue in AI development.

This work also has practical implications for enterprises. As companies expand into new markets, they need AI systems that can handle customer support, content creation, and data analysis in local languages. Building better language models for under-resourced languages isn't just about equity; it's also about market opportunity and business capability in regions where English-only AI tools are insufficient.