For the first time, the AlphaFold Database includes predictions of how proteins pair up and work together, not just their individual shapes. A collaboration between Google DeepMind, EMBL's European Bioinformatics Institute (EMBL-EBI), NVIDIA, and Seoul National University has added 1.7 million high-confidence homodimers (complexes made of two copies of the same protein) to the freely available database, which already holds around 200 million predictions of individual protein structures.

This expansion matters because proteins rarely work alone. Many critical biological functions depend on two or more proteins binding together to form stable complexes. For example, HIV-1 protease, a key drug target, becomes active only when two copies of the same protein assemble into a working enzyme. Until now, the AlphaFold Database could show what a single protein looked like, but not how that protein interacted with its partners.

Why Is Predicting Protein Pairs So Much Harder Than Predicting Single Proteins?

Predicting a protein complex is computationally harder than predicting a single protein. When proteins interact, they can change shape, move, and bind in several different ways depending on their cellular environment. The consortium generated around 30 million predicted complexes in total, and only 1.7 million homodimers cleared the confidence thresholds for inclusion in the database. This filtering step was crucial because some predicted complexes may not correspond to biologically relevant interactions, particularly where binding is temporary or context-dependent.

The computing cost was enormous. The collaboration is centrally hosting data that would otherwise take around 17 million hours of graphics processing unit (GPU) computing to recreate. That is equivalent to running a single GPU continuously for nearly 2,000 years.
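
The single-GPU comparison is simple arithmetic. As a quick sanity check, the Python sketch below converts the quoted 17 million GPU hours into years of continuous running. The function name and the 1,000-GPU cluster size are illustrative assumptions, not figures from the consortium, and the parallel case assumes perfect scaling.

```python
# Back-of-the-envelope check of the GPU-time comparison quoted in the article.
GPU_HOURS = 17_000_000          # figure quoted in the article
HOURS_PER_YEAR = 24 * 365.25    # average year length, leap years included


def gpu_hours_to_years(gpu_hours: float, n_gpus: int = 1) -> float:
    """Wall-clock years if n_gpus run continuously with perfect scaling."""
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    return gpu_hours / (HOURS_PER_YEAR * n_gpus)


# One GPU running non-stop, as in the article's comparison.
single_gpu_years = gpu_hours_to_years(GPU_HOURS)
print(f"One GPU: about {single_gpu_years:,.0f} years")

# Hypothetical 1,000-GPU cluster (assumed size, for illustration only).
cluster_years = gpu_hours_to_years(GPU_HOURS, n_gpus=1000)
print(f"1,000 GPUs: about {cluster_years:.1f} years")
```

The single-GPU result lands a little under 2,000 years, consistent with the figure quoted above.
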
By running these calculations once and storing the results in the AlphaFold Database, the consortium has made protein complex predictions available to researchers worldwide, regardless of their computing resources.

How to Access and Use the New Protein Complex Predictions

- Direct Database Access: The 1.7 million high-confidence homodimer predictions are now integrated directly into the AlphaFold Database, which is freely available to anyone with an internet connection and has over 3.4 million users from 190 countries.
- Bulk Download Options: An additional 18 million lower-confidence homodimers are available as a downloadable list and for bulk download from the EMBL-EBI file transfer protocol (FTP) server, allowing researchers to work with larger datasets offline.
- Focused on Health-Relevant Species: The initial dataset prioritizes proteins from 20 well-studied organisms, including humans, mice, yeast, and disease-causing bacteria on the World Health Organization's priority pathogens list, making it immediately relevant to global health research.
- Future Heterodimer Predictions: The consortium has already generated around 8 million heterodimer predictions (complexes made of different proteins), which are currently being analyzed and will be added to the database in the coming months.

Martin Steinegger, an associate professor at Seoul National University who led the computational methodology, explained the significance of this work: "By making predicted protein complexes accessible at an unprecedented scale, we are illuminating an unseen landscape of molecular interactions across the tree of life."

What Does This Mean for Drug Discovery and Disease Research?

Understanding how proteins interact is foundational to modern drug discovery and disease research. By visualizing protein interactions, scientists can uncover the molecular mechanisms that drive cell behavior, identify what goes wrong in disease, and develop new drugs and therapies.
Because the dataset focuses on disease-associated bacteria and human proteins, researchers can immediately start examining how proteins from pathogens and from humans assemble into complexes, potentially revealing new drug targets.

However, the AlphaFold Database remains primarily a structural reference resource. Experimental validation is still needed to confirm that predicted complexes actually form and function as expected in real biological systems. The predictions provide a starting point, not a finished answer.

Jo McEntyre, interim director of EMBL-EBI, emphasized the collaborative spirit behind this release: "Science thrives on collaboration. By making this foundational protein complex dataset openly available to the world, we're inviting researchers to test, refine, and build on it to drive the next wave of biological discoveries."

What's Next for AlphaFold?

The consortium describes this protein complex expansion as a first step. It has already calculated predictions for 30 million complexes in total and plans to add more high-confidence predictions to the database in the coming months. The next major frontier is heterodimers, which involve interactions between different proteins. These are more biologically diverse than homodimers and also harder to predict accurately.

Beyond structural prediction, the field is already moving toward functional prediction. Isomorphic Labs, which built the AlphaFold 3 AI engine together with Google DeepMind, introduced IsoDDE in February 2026, a tool that predicts how drugs can bind to proteins and helps design new therapeutic molecules based on their likely effectiveness. This represents a shift from predicting what proteins look like to predicting what they can do and how they can be targeted.

The AlphaFold database now serves over 3.4 million users from 190 countries, making it one of the most widely used scientific resources in the world. With protein complex predictions now included, that impact is poised to expand significantly, accelerating discoveries that could lead to new medicines, new products, and a deeper understanding of life itself.