NVIDIA announced Vera Rubin, a new full-stack computing platform comprising seven chips, five rack-scale systems, and one supercomputer specifically engineered for agentic AI workloads. The platform represents a fundamental shift in how companies approach AI infrastructure, moving beyond individual components to vertically integrated systems optimized as single units. This approach, called extreme codesign, pairs software and silicon design in tandem to maximize efficiency and reduce computational costs.

What Is Extreme Codesign and Why Does It Matter for AI?

Extreme codesign is the practice of designing hardware and software simultaneously rather than sequentially. NVIDIA founder and CEO Jensen Huang highlighted this approach as the foundation of the company's competitive advantage, noting that NVIDIA has achieved "the best token cost in the world" through this methodology. Token cost refers to the expense of processing individual units of data through an AI model; it is a critical metric for determining the overall economics of running large language models (LLMs) and other AI systems at scale.

The Vera Rubin platform includes several new components designed to work together seamlessly. The new NVIDIA Vera CPU anchors the system, paired with the BlueField-4 STX storage architecture. These components are not standalone products but pieces of a larger ecosystem optimized for agentic AI, meaning AI systems that can autonomously plan and execute tasks with minimal human intervention.

What Comes After Vera Rubin?

Looking beyond the current generation, NVIDIA is already planning its next major architecture, called Feynman. This future platform will introduce the NVIDIA Rosa CPU, named after Rosalind Franklin, the scientist whose X-ray crystallography work revealed the structure of DNA.
According to Huang, "As Franklin exposed the hidden architecture of life, Rosa is built to move data, tools and tokens efficiently across the full stack of agentic AI infrastructure." The Feynman generation will pair the Rosa CPU with the LP40, NVIDIA's next-generation LPU (Learning Processing Unit), along with BlueField-5 and CX10 components. These will be connected through NVIDIA Kyber for both copper and co-packaged optics scale-up, and NVIDIA Spectrum-class optical scale-out. Together, these advances target every pillar of what NVIDIA calls the "AI factory": compute, memory, storage, networking, and security.

How to Evaluate AI Infrastructure Platforms for Your Organization

- Vertical Integration: Assess whether your infrastructure provider designs hardware and software together as a unified system rather than combining off-the-shelf components, which typically results in lower efficiency and higher operational costs.
- Token Economics: Compare the cost per token processed across different platforms, as this metric directly impacts the long-term expense of running AI models in production environments.
- Agentic AI Readiness: Evaluate whether the platform is specifically optimized for autonomous AI agents that can execute complex tasks independently, not just traditional language model inference.
- Scalability Architecture: Review the networking and storage components to ensure the platform can scale from single systems to data center-wide deployments without performance degradation.

NVIDIA also announced the Vera Rubin DSX AI Factory reference design and the NVIDIA Omniverse DSX Blueprint, tools that allow companies to simulate AI factories in software before building them physically. DSX Air, part of the broader DSX platform, enables organizations to model their infrastructure investments and optimize configurations before deployment, reducing risk and capital expenditure.

The company is also extending its reach beyond Earth.
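The token-economics point above can be made concrete with a back-of-the-envelope calculation: amortize a platform's hourly operating cost over its sustained token throughput. The sketch below uses hypothetical platform names and figures purely for illustration; they are not NVIDIA or vendor pricing.

```python
# Back-of-the-envelope cost-per-token comparison.
# All platform names and numbers are hypothetical placeholders.

def cost_per_million_tokens(hourly_cost_usd: float, tokens_per_second: float) -> float:
    """Amortize a system's hourly operating cost over its token throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost_usd / tokens_per_hour * 1_000_000

# Hypothetical platforms: (hourly operating cost in USD, sustained tokens/second)
platforms = {
    "integrated_rack": (98.0, 250_000),   # co-designed hardware + software
    "commodity_cluster": (60.0, 90_000),  # off-the-shelf components
}

for name, (hourly, tps) in platforms.items():
    print(f"{name}: ${cost_per_million_tokens(hourly, tps):.3f} per 1M tokens")
```

Note how the arithmetic captures the article's argument: the integrated system costs more per hour, but its higher throughput yields a lower cost per million tokens, which is the figure that compounds in production.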
NVIDIA announced plans to bring AI data centers into orbit through systems like NVIDIA Space-1 Vera Rubin, extending accelerated computing from terrestrial facilities to space-based infrastructure. This represents a significant expansion of where AI computation can occur and opens new possibilities for latency-sensitive applications and distributed AI systems.

During the keynote at GTC 2026, Huang emphasized the scale of current AI demand. He noted that computing demand for NVIDIA GPUs has increased by approximately one million times over recent years, and he projects at least one trillion dollars in revenue from AI infrastructure between 2025 and 2027. This explosive growth reflects the rapid expansion of AI adoption across industries and the corresponding need for more efficient, purpose-built infrastructure.

The shift toward extreme codesign and vertically integrated platforms represents a maturation of the AI infrastructure market. Rather than treating AI as a software problem that can run on generic hardware, leading companies now recognize that optimal performance requires hardware and software to be engineered as cohesive systems. This approach mirrors historical patterns in computing, where specialized architectures consistently outperformed general-purpose alternatives for specific workloads. For organizations evaluating AI infrastructure investments, understanding these architectural principles is essential for making decisions that will remain relevant as AI workloads continue to evolve and scale.
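Huang's "one million times" figure can be put in perspective with simple compounding arithmetic: a millionfold increase is roughly twenty doublings, and the implied annual growth multiplier depends on the window assumed. The ten-year window below is an assumption chosen for illustration, not a figure from the keynote.

```python
import math

# A millionfold increase expressed as doublings: 2**n = 1_000_000
doublings = math.log2(1_000_000)          # about 20 doublings

# Implied compound annual multiplier if the increase took 10 years (assumed window)
years = 10
annual_growth = 1_000_000 ** (1 / years)  # roughly 4x per year

print(f"doublings: {doublings:.2f}")
print(f"implied annual multiplier over {years} years: {annual_growth:.2f}x")
```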