Two European deep-tech companies have cracked a major bottleneck in AI deployment: they can now compress massive AI models by up to 95% while keeping them nearly as capable, making it possible to run sophisticated artificial intelligence directly on small devices without constant cloud connections. Multiverse Computing, headquartered in Spain, and Dutch chipmaker Axelera AI announced the partnership on March 18, 2026, positioning compressed AI models on Axelera's Metis and forthcoming Europa hardware platforms.

The collaboration addresses a real-world problem facing enterprises: many organizations need AI to work in places where cloud connectivity is unreliable, expensive, or simply unavailable. Think of a factory floor with spotty internet, a retail store processing video in real time, or a smart city sensor network operating during network outages. Running AI locally solves these challenges.

How Does Model Compression Actually Work, and Why Does It Matter?

Multiverse's CompactifAI technology does the heavy lifting here. The company can shrink large language models (LLMs), the AI systems behind tools like ChatGPT, by up to 95% in size while losing only 2% to 3% of their accuracy. To put that in perspective, an LLM that normally requires gigabytes of memory and specialized hardware can now fit on a smartphone, industrial controller, or edge device at a fraction of the power consumption.

This compression matters because smaller models demand less memory, require cheaper chips, and consume dramatically less energy. In environments where a constant connection to cloud infrastructure is impractical or prohibitively expensive, this efficiency shift changes the entire economics of AI deployment.

What Real-World Problems Does This Partnership Solve?

The integration of Multiverse's compressed models into Axelera's hardware platforms enables workloads that normally require "datacenter-class infrastructure" to run on compact devices.
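As a back-of-the-envelope illustration of what a 95% size reduction means in practice, consider the memory footprint of a hypothetical 7-billion-parameter model stored at 16-bit precision. The model size and precision here are assumptions chosen for the example, not figures from the announcement:

```python
# Illustrative arithmetic only: how a 95% compression changes the memory
# footprint of a hypothetical LLM. The 7B parameter count and fp16
# precision are assumed example values, not Multiverse/Axelera figures.

def model_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed just to hold the weights, in gigabytes."""
    return n_params * bytes_per_param / 1e9

baseline = model_memory_gb(7e9)      # 7B-parameter model at fp16 -> 14.0 GB
compressed = baseline * (1 - 0.95)   # after a 95% size reduction -> 0.7 GB

print(f"baseline:   {baseline:.1f} GB")
print(f"compressed: {compressed:.2f} GB")
```

At roughly 14 GB, the uncompressed weights exceed the memory of most edge devices; at under 1 GB, they fit comfortably alongside an application, which is the shift the article describes.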
The companies are targeting several specific use cases:

- Industrial Operations: Factories and manufacturing plants can run computer vision systems and predictive maintenance AI without relying on cloud uploads, keeping sensitive operational data on-site and maintaining uptime even during network failures.
- Retail Environments: Stores can process customer behavior, inventory management, and security footage locally, reducing bandwidth costs and enabling real-time decision-making at the checkout and shelf level.
- Mobility and Autonomous Systems: Vehicles and robots can make split-second decisions using local AI inference, which is critical for safety-sensitive applications where the latency of cloud round-trips is unacceptable.
- Defense and Smart Cities: Government and municipal deployments can maintain data sovereignty while running advanced AI analytics across distributed sensor networks.

Enrique Lizaso, co-founder and CEO of Multiverse Computing, explained the strategic vision: "Our mission is to make state-of-the-art AI radically more efficient and accessible. By combining Multiverse's advanced compressed AI models with Axelera's high-performance edge platforms, we can bring powerful reasoning capabilities to devices where latency, privacy and energy consumption are critical."

Why Is On-Device Fine-Tuning a Game-Changer?

Beyond running inference (the process of using a trained model to make predictions), the partnership also enables on-device fine-tuning. Fine-tuning means adjusting a pre-trained model with new data specific to a particular organization or use case, without sending that sensitive data to the cloud.

Historically, fine-tuning at the edge has been technically demanding because it requires significant memory and computing power. Compression changes this equation: by reducing the base model size, organizations need less headroom to adjust and customize the model locally, making fine-tuning feasible on resource-constrained devices.
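The general idea behind affordable local fine-tuning can be sketched with a low-rank adapter (LoRA-style), in which the base weights stay frozen and only two small factor matrices are trained. This is a generic illustration of parameter-efficient tuning, not CompactifAI's actual (proprietary) method, and the layer dimensions are arbitrary example values:

```python
# Sketch of parameter-efficient fine-tuning with a low-rank adapter.
# Only the small matrices A and B are trainable; the large base weight
# W stays frozen. Dimensions are illustrative, not from the article.
import numpy as np

d_in, d_out, rank = 4096, 4096, 8

# Frozen base weight: never updated, so it needs no gradient/optimizer state.
W = np.random.randn(d_out, d_in).astype(np.float32)

# Trainable low-rank factors; A starts at zero so the initial delta is zero.
A = np.zeros((rank, d_in), dtype=np.float32)
B = (np.random.randn(d_out, rank) * 0.01).astype(np.float32)

def forward(x: np.ndarray) -> np.ndarray:
    # Effective weight is W + B @ A, applied without materializing it.
    return W @ x + B @ (A @ x)

full_params = W.size              # parameters updated by full fine-tuning
adapter_params = A.size + B.size  # parameters updated with the adapter
print(f"trainable fraction: {adapter_params / full_params:.4%}")
```

Here the adapter trains well under 1% of the layer's parameters, which is why a smaller compressed base model leaves enough headroom for customization on a memory-constrained edge device.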
This capability is especially valuable for enterprises handling regulated data. A healthcare provider can fine-tune a medical AI model on patient records without ever uploading that information off-premises. A financial institution can customize fraud detection without exposing transaction data to third parties.

How Does This Fit Into Europe's AI Strategy?

Both companies are backed by the European Innovation Council, and they have explicitly framed this partnership around European technological sovereignty. The idea of a "sovereign" AI stack has gained prominence as governments and enterprises reassess their dependence on US hyperscalers and non-European chip supply chains.

Ekaterina Zaharieva, Commissioner for Startups, Research and Innovation, underscored the broader significance: "Europe's competitiveness in the next decade will depend on our ability to combine world-class chips with trustworthy, efficient AI. Collaborations like the one between Multiverse Computing and Axelera AI, both supported significantly by the European Innovation Council and its Fund, show how European deep-tech companies, when connected to each other, work together to deliver sovereign strategic digital technologies that are developed and scaled in Europe while serving global markets."

The partnership directly addresses European policy goals around reducing geopolitical exposure and supply chain risk. By developing and manufacturing both the compressed AI models and the hardware acceleration technologies within Europe, the companies aim to reduce dependence on non-European infrastructure while empowering regional industry and public institutions.

What Are the Practical Benefits for Organizations?
The collaboration enables three concrete advantages for customers deploying AI at scale:

- Ultra-Efficient Inference: Run complex AI models on low-power edge devices with dramatically reduced energy consumption, lowering operational costs and enabling deployment in power-constrained environments like remote sensors or battery-powered devices.
- Local Fine-Tuning: Customize models with proprietary data without uploading sensitive information to the cloud, preserving privacy and ensuring compliance with regulations like GDPR or HIPAA.
- Cost-Effective Fleet Scaling: Deploy AI across large numbers of devices at a lower cost per unit, since each device requires less powerful (and therefore cheaper) hardware to run the compressed models effectively.

Fabrizio Del Maffeo, co-founder and CEO of Axelera AI, noted the expanded opportunity: "Axelera AI is committed to delivering the most powerful and efficient AI inference solutions to the world. Enabling Multiverse Computing's compressed AI models to run on our Metis and future Europa platforms will unlock new classes of applications for our customers, from industrial and retail to mobility, defense, smart cities and more."

What Happens Next?

The companies have completed the technical integration phase and are now moving into a dedicated commercialization phase. Multiverse, headquartered in Donostia-San Sebastian in Spain with offices across Europe, the US, and Canada, will work with Axelera, based in Eindhoven with staff across several European countries plus Taiwan and the US, to bring the combined solution to market.

The timing is significant. As agentic AI systems (autonomous AI agents that can reason and make decisions independently) become more prevalent, edge inference is becoming increasingly critical.
A separate report from InterDigital and ABI Research warns that agentic AI will generate continuous upstream data from smart glasses, wearables, smartphones, and IoT devices, potentially straining mobile networks unless intelligence is distributed across devices and edge infrastructure. The Multiverse-Axelera partnership directly addresses this architectural shift by making it practical to embed powerful AI reasoning at the edge.