Japanese Students Win International AI Chemistry Competition by Predicting Molecular Light Behavior

Two doctoral students and one master's student from the Ohue Laboratory at Science Tokyo have won first prize at an international machine learning competition by building an AI system that predicts spectroscopic properties of chemical compounds. Their result points to a shift in how scientists approach one of chemistry's most time-consuming challenges: understanding how molecules interact with light. The competition, organized by EU-OPENSCREEN and the Society for Laboratory Automation and Screening (SLAS), attracted teams from around the world and focused specifically on predicting transmittance, a measure of how much light passes through a substance.
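
For reference, transmittance has a standard definition in spectroscopy. It is given here as general background, not as the competition's exact prediction target, which the article does not specify:

```latex
T = \frac{I}{I_0}, \qquad A = -\log_{10} T
```

Here I_0 is the intensity of the light entering the sample, I is the intensity that emerges, and A is the corresponding absorbance. A sample with T = 0.10, for example, lets 10% of the light through and has an absorbance of 1.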

Why Can't Scientists Just Test This Stuff in the Lab?

Traditionally, chemists determine spectroscopic properties through physical experimentation, which is expensive, time-consuming, and requires synthesizing actual compounds. This creates a bottleneck in drug discovery and materials science, where researchers need to screen thousands of potential candidates quickly. Machine learning offers a shortcut: if you can train an AI system on existing data, it can predict properties for new compounds without requiring lab work. The challenge is building a model that generalizes well beyond the training data, meaning it works on compounds it has never seen before.

Team Yumiz, made up of second-year doctoral students Kairi Furui and Apakorn Kengkanna and second-year master's student Koh Sakano, all from the Department of Computer Science, tackled this problem with an approach that combines multiple types of molecular information. Rather than relying on a single machine learning model, they built what's called a weighted ensemble, which blends predictions from different algorithms to achieve better accuracy.
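
To make the idea of a weighted ensemble concrete, here is a minimal sketch in Python. It is not the team's code. The model names, the numbers, and the choice to weight each model by the inverse of its validation error are all assumptions made for illustration; the article does not say how the team chose its weights.

```python
from typing import Dict, List


def inverse_error_weights(val_errors: Dict[str, float]) -> Dict[str, float]:
    """Turn per-model validation errors into weights that sum to 1.

    Assumption for illustration: a model with a lower error gets a larger weight.
    """
    inverse = {name: 1.0 / err for name, err in val_errors.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def weighted_ensemble(
    predictions: Dict[str, List[float]], weights: Dict[str, float]
) -> List[float]:
    """Blend per-model predictions into one weighted average per compound."""
    n_compounds = len(next(iter(predictions.values())))
    blended = [0.0] * n_compounds
    for name, preds in predictions.items():
        for i, value in enumerate(preds):
            blended[i] += weights[name] * value
    return blended


# Hypothetical validation errors and predictions for two compounds.
val_errors = {"tree_model": 0.10, "neural_net": 0.20}
predictions = {
    "tree_model": [0.80, 0.40],
    "neural_net": [0.74, 0.46],
}

weights = inverse_error_weights(val_errors)  # tree_model 2/3, neural_net 1/3
blended = weighted_ensemble(predictions, weights)
print(weights)
print(blended)
```

With these made-up numbers the tree-based model gets twice the weight of the neural network, so the blended predictions land closer to the tree model's values.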

How to Build a Better Molecular Prediction System

  • Multi-Dimensional Molecular Data: The team incorporated 1D (linear sequences), 2D (structural diagrams), and 3D (spatial arrangements) representations of molecules, capturing different aspects of chemical structure that influence how light interacts with the compound.
  • Hybrid Model Architecture: They trained both tree-based models, which excel at finding patterns in structured data, and deep learning models, which can discover complex nonlinear relationships that simpler algorithms might miss.
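  • Rigorous Validation Method: The team used 5-fold Murcko scaffold cross-validation, a technique that groups molecules by their core ring framework (the Murcko scaffold) and holds out entire groups for testing, so the validation score reflects performance on structurally unfamiliar compounds rather than on patterns memorized from the training set.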
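
The scaffold-based validation in the last bullet can be sketched as follows. This is an illustrative sketch, not the team's pipeline. It assumes the scaffold of each molecule has already been computed (in practice, RDKit's `MurckoScaffold.MurckoScaffoldSmiles` is commonly used for that step), and the scaffold strings and the greedy fold assignment are assumptions chosen for the example.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple


def scaffold_kfold(scaffolds: List[str], n_folds: int = 5) -> List[List[int]]:
    """Assign molecule indices to folds so that no scaffold spans two folds."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for idx, scaffold in enumerate(scaffolds):
        groups[scaffold].append(idx)

    folds: List[List[int]] = [[] for _ in range(n_folds)]
    # Greedy balancing (an assumption): place the largest scaffold groups
    # first, each into whichever fold currently holds the fewest molecules.
    for members in sorted(groups.values(), key=len, reverse=True):
        smallest = min(folds, key=len)
        smallest.extend(members)
    return folds


def train_test_splits(folds: List[List[int]]) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists, holding out one fold at a time."""
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


# Hypothetical precomputed scaffold SMILES, one entry per molecule.
scaffolds = [
    "c1ccccc1", "c1ccccc1", "c1ccccc1",   # benzene core
    "c1ccncc1", "c1ccncc1",               # pyridine core
    "C1CCCCC1", "C1CCCCC1",               # cyclohexane core
    "c1ccc2ccccc2c1",                     # naphthalene core
    "c1ccoc1",                            # furan core
    "c1ccsc1",                            # thiophene core
]

folds = scaffold_kfold(scaffolds, n_folds=5)
for train, test in train_test_splits(folds):
    print("held-out scaffolds:", sorted({scaffolds[i] for i in test}))
```

Because every molecule sharing a scaffold lands in the same fold, each test fold contains only core structures the model never saw during training for that round.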

This combination proved superior to simpler approaches. The team received their award on February 9 at a ceremony during the SLAS2026 International Conference and Exhibition in Boston.

"The findings of Team Yumiz demonstrate the international competitiveness of young researchers driving the advancement of data-driven science, while also reaffirming the potential for applying machine learning technologies to the field of chemistry," stated Associate Professor Masahito Ohue.

Ohue is an associate professor in the School of Computing at Science Tokyo.

The significance of this win extends beyond the competition itself. Japan has designated "AI for Science" as a key policy priority, recognizing that artificial intelligence can accelerate scientific discovery across multiple fields. The success of Team Yumiz demonstrates that young researchers in Japan are developing internationally competitive capabilities in data-driven science.

What Does This Mean for Drug Discovery and Materials Science?

Predicting spectroscopic properties is not just an academic exercise. In drug discovery, understanding how a molecule absorbs and transmits light helps researchers identify promising candidates before investing in expensive synthesis and testing. In materials science, the same capability accelerates the development of new polymers, semiconductors, and other advanced materials. The ability to screen thousands of virtual compounds computationally, then synthesize only the most promising ones, could cut development timelines and costs significantly.

The broader context matters too. While pharmaceutical companies and research institutions have invested heavily in AI-driven drug discovery, the tools and methodologies remain concentrated in well-funded labs. Team Yumiz's approach, which combines publicly available machine learning techniques with careful experimental design, suggests that these capabilities are becoming more accessible to academic researchers and smaller organizations.

Looking ahead, Science Tokyo anticipates applying these findings to real-world data and implementing them through industry-academia collaboration. That points toward possible partnerships with pharmaceutical and materials companies. The next phase will be testing whether the model's predictions hold up when applied to proprietary compounds and novel chemical spaces that differ from the training data.

The win also reflects a broader trend in computational chemistry: the shift from rule-based systems that encode human chemical knowledge to data-driven systems that learn patterns directly from examples. As more spectroscopic data becomes available and computational resources become cheaper, this approach will likely become standard practice in chemistry labs worldwide.